(54) Title: METHODS AND COMPOSITIONS FOR DIAGNOSING DYSPLASIA

(57) Abstract: Methods and compositions are disclosed for detecting dysplasia in a tissue sample, screening candidate compounds 
for the ability to inhibit growth of a cancer cell, predicting predisposition to adenocarcinoma and treating cancer based on gene 
expression profiles. 

METHODS AND COMPOSITIONS FOR DETECTING DYSPLASIA 

TECHNICAL FIELD 

The present invention relates to nucleic acid sequences, and compositions and uses
therefor, which have been shown to be differentially expressed in high-grade dysplasia and
which are useful as markers for the detection of high-grade dysplasia in a patient, and are 
implicated in the development of adenocarcinoma. 

BACKGROUND OF THE INVENTION 

The incidence of esophageal adenocarcinoma is rising in Western countries, replacing
squamous cell carcinoma as the most common neoplasm of the esophagus in white males and
increasing in other ethnic groups (Devesa et al., Cancer 83:2049-2053 (1998); and
Bollschweiler et al., Cancer 92:549-555 (2001)). Barrett's esophagus (BE) is the primary
recognized risk factor for esophageal adenocarcinoma. BE results from repeated injury to the
esophageal mucosa and develops in a subset of patients with chronic gastrointestinal reflux
disease. It is characterized by a metaplastic change of squamous esophageal epithelium to
intestinalized columnar mucosa (Csendes et al., Dis. Esoph 13:5-11 (2000); Cameron et al.,
New Eng. J. Med. 313:857-859 (1985); and Drewitz et al., Amer. J. Gastroenterol 92:212-215
(1997)).

Barrett's esophagus is found in 6%-16% of patients undergoing upper gastrointestinal
endoscopy for gastroesophageal reflux, and it is estimated that a substantial patient population
remains undiagnosed (Sarr et al., Amer. J. Surgery 149:187-193 (1985); Winters et al.,
Gastroenterology 92:118-124 (1985); Cameron et al., Gastroenterology 99:918-922 (1990);
and Cameron et al., Gastroenterology 103:1241-1245 (1992)). The risk of developing
esophageal carcinoma is 30-150 times greater in patients with BE. The outlook for patients
diagnosed with adenocarcinoma is poor, with a 5-year survival rate of 10-15% (Streitz et al.,
Ann. Surg. 213:122-125 (1991); Menke-Pluymers et al., Gut 33:1454-1458 (1992); and Lerut
et al., J. Thorac. Cardiovasc. Surg. 107:1059-1066 (1994)). Patients with BE are placed on
surveillance programs, although the absolute risk of developing adenocarcinoma in the context
of BE remains relatively low, estimated at approximately 0.5% per patient year (Drewitz et al.,
Amer. J. Gastroenterol 92:212-215 (1997); O'Connor et al., Am. J. Gastroenterol 94:2037-2042
(1999); Spechler et al., JAMA 285:2331-2338 (2001); and Shaheen et al., Gastroenterology
119:333-338 (2000)). The value and cost-effectiveness of surveillance programs continue to
be debated due to lack of understanding of the natural history of BE, the difficulty in obtaining
representative biopsies by random sampling due to the heterogeneous nature of intestinal
metaplasia, and inter-observer variability in endoscopic and histopathologic diagnosis (Falk,
Gastroenterology 122:1569-1591 (2002); Sampliner, Am. J. Gastroenterol. 93:1028-1032
(1998); and Alikhan et al., Gastrointest. Endosc. 50:23-26 (1999)). A
metaplasia-dysplasia-carcinoma sequence has been described for BE, and genetic changes
involving cell cycle abnormalities, DNA ploidy, mutations, and amplification and expression
of oncogenes have been identified (al-Kasspooles et al., Internat. J. Cancer 54:213-219
(1993); Vissers et al., Anticancer Res. 21:3813-3820 (2001); Bani-Hani et al., J. Natl.
Cancer Inst. 92:1316-1321 (2000); Walch et al., Am. J. Pathol. 156:555-566 (2000); Wong et
al., Cancer Res. 61:8284-8289 (2001); and Romagnoli et al., Laboratory Investigation
81:241-247 (2001)). There is a need for reliable detection of high-grade dysplasia and
diagnosis of patients, such as BE patients, likely to develop adenocarcinoma, thereby
allowing the disease to be monitored and treated early in its progression.

SUMMARY OF THE INVENTION 

Generally, the present invention is based on the discovery that it is possible to detect
high-grade dysplasia in a patient suspected of experiencing dysplasia, such as dysplasia
associated with gastrointestinal reflux disease, such as Barrett's esophagus, or colon tissue
dysplasia, by determining expression in an esophageal or colon biopsy from the patient
wherein at least eight genes selected from a group of genes are expressed at a level of at least
1.5-fold over expression in a control sample. The control sample may comprise an esophageal
or colon biopsy from a normal patient (i.e., one not experiencing gastrointestinal reflux
disease) or from pooled samples of normal epithelial tissue (such as from normal liver, lung
and kidney tissue). The group of high-grade dysplasia (HGD) gene markers, and their
encoded polypeptides, comprises ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1 or 2);
AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3 or 4);
ADAM8 (NM_001109) (SEQ ID NO:5 or 6); PRSS8 (Prostasin precursor, serine protease,
NM_002773) (SEQ ID NO:7 or 8); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9
or 10); NR0B2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:11 or 12); TM7SF1
(NM_003272) (SEQ ID NO:13 or 14); DLDH (dihydrolipoamide dehydrogenase, NM_000108)
(SEQ ID NO:15 or 16); MAT2B (methionine adenosyltransferase II, beta, NM_013283)
(SEQ ID NO:17 or 18); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19 or 20); PPBI
(alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21 or 22); SLNAC1
(sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23 or 24); CAH4 (carbonic
anhydrase iv precursor, NM_000717) (SEQ ID NO:25 or 26); PA21 (phospholipase a2
precursor, NM_000928) (SEQ ID NO:27 or 28); PAR2 (proteinase activated receptor 2
precursor, NM_005242) (SEQ ID NO:29 or 30); IDE (insulin-degrading enzyme,
NM_004969) (SEQ ID NO:31 or 32); MYO1A (myosin-IA, NM_005379) (SEQ ID NO:33 or
34); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35 or 36);
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37 or 38);
CYB5 (cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene,
last exon and flanking sequence, NM_001863) (SEQ ID NO:41 or 42); and TCF4
(NM_030756) (SEQ ID NO:43 or 44). HGD marker polypeptides refer to the polypeptides
encoded by the HGD gene markers.
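
The detection rule stated above (at least eight marker genes, each expressed at least 1.5-fold over the control) can be illustrated with a minimal Python sketch. The function names, the dictionary layout and the expression values are hypothetical and chosen only for illustration; the sketch assumes expression levels have already been measured and normalized, and it simply skips genes that lack a usable control value.

```python
from typing import Mapping

FOLD_THRESHOLD = 1.5  # "at least 1.5-fold over expression in a control sample"
MIN_GENES = 8         # "at least eight genes selected from a group of genes"


def count_upregulated(test: Mapping[str, float],
                      control: Mapping[str, float],
                      fold: float = FOLD_THRESHOLD) -> int:
    """Count marker genes whose test/control expression ratio is >= fold."""
    count = 0
    for gene, level in test.items():
        baseline = control.get(gene)
        if baseline is None or baseline <= 0:
            continue  # assumption: genes without a usable baseline are skipped
        if level / baseline >= fold:
            count += 1
    return count


def meets_hgd_criterion(test: Mapping[str, float],
                        control: Mapping[str, float],
                        min_genes: int = MIN_GENES,
                        fold: float = FOLD_THRESHOLD) -> bool:
    """True when at least min_genes markers reach the fold threshold."""
    return count_upregulated(test, control, fold) >= min_genes


# Hypothetical normalized expression values, for illustration only.
markers = ["ET-1", "AGR2", "ADAM8", "PRSS8", "STC-2",
           "IDE", "CYP2J2", "PHYH", "CYB5", "TCF4"]
control = {gene: 1.0 for gene in markers}
biopsy = {gene: 2.0 for gene in markers[:8]}       # eight genes at 2-fold
biopsy.update({gene: 1.0 for gene in markers[8:]})  # two genes unchanged
print(meets_hgd_criterion(biopsy, control))
```

Keeping the threshold and the minimum gene count as parameters makes it easy to see how the call changes when either value is varied.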

In an aspect, the invention involves a method for the diagnosis of esophageal
high-grade dysplasia (HGD) in a patient, comprising establishing increased expression of at
least eight genes (listed here with the polypeptide encoded by the gene) selected from the group
consisting of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1 or 2); AGR2 (anterior gradient
2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3 or 4); ADAM8 (NM_001109)
(SEQ ID NO:5 or 6); PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID
NO:7 or 8); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9 or 10); NR0B2
(Nuclear hormone receptor, NM_021969) (SEQ ID NO:11 or 12); TM7SF1 (NM_003272)
(SEQ ID NO:13 or 14); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID
NO:15 or 16); MAT2B (methionine adenosyltransferase II, beta, NM_013283) (SEQ ID
NO:17 or 18); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19 or 20); PPBI (alkaline
phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21 or 22); SLNAC1 (sodium
channel receptor SLNAC1, NM_004769) (SEQ ID NO:23 or 24); CAH4 (carbonic anhydrase
iv precursor, NM_000717) (SEQ ID NO:25 or 26); PA21 (phospholipase a2 precursor,
NM_000928) (SEQ ID NO:27 or 28); PAR2 (proteinase activated receptor 2 precursor,
NM_005242) (SEQ ID NO:29 or 30); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID
NO:31 or 32); MYO1A (myosin-IA, NM_005379) (SEQ ID NO:33 or 34); CYP2J2
(cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35 or 36); PHYH
(phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37 or 38); CYB5
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene, last exon and
flanking sequence, NM_001863) (SEQ ID NO:41 or 42); and TCF4 (NM_030756) (SEQ ID
NO:43 or 44); and comparing expression of the genes to a baseline expression of the genes in
normal tissue controls; wherein an increase of at least 1.5-fold in expression (and/or p value <
0.07) of the genes from the group relative to the baseline indicates that the patient is
experiencing esophageal high-grade dysplasia. In an embodiment of the invention, the tissue
is human tissue.
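
As an illustration of the per-gene comparison to a normal tissue baseline, the following Python sketch computes a fold change from replicate measurements and pairs it with a p-value. The text does not name a statistical test, so the exact one-sided permutation test used here is an assumption, as are the function names, the replicate values and the reading of the p-value cutoff as 0.07. The `require_both` flag reflects the "and/or" wording.

```python
from itertools import combinations
from statistics import mean


def fold_change(test, baseline):
    """Ratio of mean test expression to mean baseline expression."""
    return mean(test) / mean(baseline)


def permutation_p_value(test, baseline):
    """Exact one-sided permutation p-value for mean(test) > mean(baseline).

    Assumption: the source does not specify a test; this one is a stand-in.
    """
    pooled = list(test) + list(baseline)
    k = len(test)
    rest = len(pooled) - k
    total = sum(pooled)
    observed = mean(test) - mean(baseline)
    hits = 0
    arrangements = 0
    for idx in combinations(range(len(pooled)), k):
        group_sum = sum(pooled[i] for i in idx)
        diff = group_sum / k - (total - group_sum) / rest
        arrangements += 1
        if diff >= observed - 1e-12:  # tolerance for float rounding
            hits += 1
    return hits / arrangements


def gene_is_increased(test, baseline, fold=1.5, alpha=0.07, require_both=False):
    """Apply the fold-change and/or p-value criterion to one gene."""
    fc_ok = fold_change(test, baseline) >= fold
    p_ok = permutation_p_value(test, baseline) < alpha
    return (fc_ok and p_ok) if require_both else (fc_ok or p_ok)


# Hypothetical replicate measurements for a single marker gene.
patient = [4.0, 4.2, 3.9, 4.1]
normal = [2.0, 2.1, 1.9, 2.0]
print(fold_change(patient, normal), permutation_p_value(patient, normal))
```

With four replicates per group there are 70 possible arrangements, so the smallest attainable p-value is 1/70; smaller replicate counts may not be able to reach a given cutoff at all.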

In another embodiment, the invention involves a method of identifying a patient
susceptible to esophageal adenocarcinoma, comprising diagnosing esophageal high-grade
dysplasia in a patient by establishing increased expression of at least eight genes selected from
the group consisting of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1); AGR2 (anterior
gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8 (NM_001109)
(SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:7);
AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NR0B2 (Nuclear hormone
receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID NO:13); DLDH
(dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B (methionine
adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2,
NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor,
NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM_004769)
(SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ ID NO:25);
PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2 (proteinase activated
receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme,
NM_004969) (SEQ ID NO:31); MYO1A (myosin-IA, NM_005379) (SEQ ID NO:33);
CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35); PHYH
(phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon
and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID
NO:43); and comparing expression of the genes to a baseline expression of the genes in
normal tissue controls; wherein an increase of at least 1.5-fold in expression of the genes from
the group relative to the baseline indicates that the patient is experiencing esophageal
high-grade dysplasia. Alternatively, the patient may be susceptible to colon carcinoma and the
diagnosing of high-grade dysplasia is by similarly determining expression of at least eight
genes of the above group in a test colon tissue sample compared to a normal colon tissue
sample.

In still another embodiment, the invention involves a method for determining whether
an esophageal tissue is predisposed to a neoplastic transformation, comprising determining
whether in a cell from the esophageal tissue at least eight nucleic acid sequences selected from
the group consisting of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1); AGR2 (anterior
gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8 (NM_001109)
(SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:7);
AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NR0B2 (Nuclear hormone
receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID NO:13); DLDH
(dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B (methionine
adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2,
NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor,
NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM_004769)
(SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ ID NO:25);
PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2 (proteinase activated
receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme,
NM_004969) (SEQ ID NO:31); MYO1A (myosin-IA, NM_005379) (SEQ ID NO:33);
CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35); PHYH
(phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon
and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID
NO:43) are expressed at least 1.5-fold above baseline expression in a normal tissue control. In
an embodiment, the tissue is human tissue.

In another aspect, the invention involves a method for the diagnosis of esophageal
high-grade dysplasia in a patient, comprising establishing the level of expression of a
polypeptide encoded by at least eight genes selected from the group consisting of ET-1
(endothelin-1, NM_001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenopus laevis) homolog,
NM_006408) (SEQ ID NO:3); ADAM8 (NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin
precursor, serine protease, NM_002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor,
NM_005076) (SEQ ID NO:9); NR0B2 (Nuclear hormone receptor, NM_021969) (SEQ ID
NO:11); TM7SF1 (NM_003272) (SEQ ID NO:13); DLDH (dihydrolipoamide dehydrogenase,
NM_000108) (SEQ ID NO:15); MAT2B (methionine adenosyltransferase II, beta,
NM_013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19);
PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21); SLNAC1
(sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23); CAH4 (carbonic
anhydrase iv precursor, NM_000717) (SEQ ID NO:25); PA21 (phospholipase a2 precursor,
NM_000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor,
NM_005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID
NO:31); MYO1A (myosin-IA, NM_005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450
monooxygenase, NM_000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase
(Refsum disease), NM_006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end,
NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence,
NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID NO:43); and comparing
expression of the at least eight genes from the group to a baseline expression of the genes in
normal tissue controls; wherein an increase of at least 1.5-fold in expression of the polypeptide
encoded by the genes from the group relative to the baseline indicates that the patient has
esophageal dysplasia.

In an embodiment, the method involves contacting a HGD cell or a cancer cell with an 
antibody that binds specifically to a polypeptide, or fragment thereof, encoded by a gene 
selected from the group of HGD marker genes or cancer marker genes as disclosed herein. 

In an embodiment, the method involves determining expression of at least 8 of the
genes of the group of HGD marker genes by nucleic acid microarray analysis. In a further
embodiment, the microarray comprises nucleic acid sequences of at least 20 nucleotides
derived from at least eight of the genes from the following group: ET-1 (endothelin-1,
NM_001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenopus laevis) homolog,
NM_006408) (SEQ ID NO:3); ADAM8 (NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin
precursor, serine protease, NM_002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor,
NM_005076) (SEQ ID NO:9); NR0B2 (Nuclear hormone receptor, NM_021969) (SEQ ID
NO:11); TM7SF1 (NM_003272) (SEQ ID NO:13); DLDH (dihydrolipoamide dehydrogenase,
NM_000108) (SEQ ID NO:15); MAT2B (methionine adenosyltransferase II, beta,
NM_013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19);
PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21); SLNAC1
(sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23); CAH4 (carbonic
anhydrase iv precursor, NM_000717) (SEQ ID NO:25); PA21 (phospholipase a2 precursor,
NM_000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor,
NM_005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID
NO:31); MYO1A (myosin-IA, NM_005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450
monooxygenase, NM_000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase
(Refsum disease), NM_006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end,
NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence,
NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID NO:43).

In another embodiment, the invention involves analysis using a microarray comprising
nucleic acid probe sequences comprising at least 20 contiguous nucleotides from at least 8
genes selected from the group of HGD marker genes: ET-1 (endothelin-1, NM_001955) (SEQ
ID NO:1); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID
NO:3); ADAM8 (NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease,
NM_002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9);
NR0B2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272)
(SEQ ID NO:13); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15);
MAT2B (methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2
(stanniocalcin-2, NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal
precursor, NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1,
NM_004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ
ID NO:25); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2
(proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE
(insulin-degrading enzyme, NM_004969) (SEQ ID NO:31); MYO1A (myosin-IA, NM_005379) (SEQ
ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35);
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon
and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID
NO:43).
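
The probe requirement above (at least 20 contiguous nucleotides from each of at least eight marker genes) can be checked mechanically. The following Python sketch is one hedged way to do so: the function names are hypothetical, the sequences are toy strings rather than any of the SEQ ID NOs, and reverse-complement matching is deliberately left out for brevity.

```python
from typing import Iterable, Mapping, Set

MIN_CONTIGUOUS = 20  # "at least 20 contiguous nucleotides"


def shares_contiguous_run(probe: str, transcript: str,
                          min_len: int = MIN_CONTIGUOUS) -> bool:
    """True if some min_len-long window of the probe occurs in the transcript."""
    probe, transcript = probe.upper(), transcript.upper()
    if len(probe) < min_len:
        return False
    return any(probe[i:i + min_len] in transcript
               for i in range(len(probe) - min_len + 1))


def genes_covered(probes: Iterable[str],
                  transcripts: Mapping[str, str],
                  min_len: int = MIN_CONTIGUOUS) -> Set[str]:
    """Return the genes matched by at least one probe on the array."""
    probes = list(probes)
    return {gene for gene, seq in transcripts.items()
            if any(shares_contiguous_run(p, seq, min_len) for p in probes)}


# Toy sequences for illustration only; they are NOT the disclosed sequences.
transcripts = {"GENE_A": "ACGTTGCA" * 6, "GENE_B": "GGATCCAT" * 6}
probe = transcripts["GENE_A"][3:28]  # 25 contiguous nucleotides of GENE_A
print(genes_covered([probe], transcripts))
```

An array design would satisfy the stated condition when `len(genes_covered(...))` is at least eight for the marker transcripts of interest.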
In a further embodiment, the methods of detecting high-grade dysplasia, diagnosing
high-grade dysplasia, or prognosing development of cancer from detected high-grade
dysplasia involve determining expression of at least eight genes from the group of HGD
markers disclosed herein above as determined by an analysis method including, but not limited
to, polymerase chain reaction analysis, real-time polymerase chain reaction analysis, Taqman®
polymerase chain reaction analysis, nucleic acid hybridization, fluorescent in situ
hybridization and non-fluorescent in situ hybridization (e.g., radioactive, colorimetric,
enzymatic or enzyme-linked detection methods for in situ hybridization). Where the method
of the invention involves determining increased expression of polypeptides encoded by at least
eight HGD marker genes as disclosed herein above, an embodiment of the method involves
analysis using an antibody capable of specifically binding to a polypeptide, or a fragment
thereof, encoded by a HGD marker gene.

In an alternative embodiment, the analytical methods of the invention involve probes
or targets labelled with radionuclides or enzymatic labels such that expression of a gene or
polypeptide is determinable.

In an embodiment of any of the methods or compositions of the invention, the
dysplasia is high-grade dysplasia of esophagus tissue and the cancer is esophageal
adenocarcinoma. Alternatively, the patient is a human patient.

In another aspect, the invention involves a method of treating high-grade esophageal
dysplasia or inhibiting or preventing cancer in a patient in need of such treatment, the method
comprising administering to the patient a compound capable of decreasing expression of a
gene selected from the group consisting of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1);
AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8
(NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773)
(SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NR0B2
(Nuclear hormone receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID
NO:13); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B
(methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2
(stanniocalcin-2, NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal
precursor, NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1,
NM_004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ

WO 2004/044178 



PCT/US2003/036260 



ID NO:25); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2
(proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE
(insulin-degrading enzyme, NM_004969) (SEQ ID NO:31); MYO1A (myosin-IA, NM_005379) (SEQ
ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35);
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon
and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID
NO:43).

In still another aspect, the invention involves a method of treating high-grade
esophageal dysplasia or inhibiting or preventing cancer in a patient in need of such treatment,
the method comprising administering to the patient a compound capable of decreasing
expression of a polypeptide encoded by a gene selected from the HGD marker genes.

In still another aspect, the invention involves a method of treating high-grade
esophageal dysplasia or inhibiting or preventing cancer in a patient in need of such treatment,
the method comprising administering to the patient a compound capable of inhibiting activity
of a polypeptide encoded by a gene which is one of at least eight genes selected from the
group of HGD marker genes as disclosed herein. In an embodiment, the compound is an
antagonist of the polypeptide. In a further embodiment, the antagonist is an antibody, such as
a monoclonal antibody or a humanized monoclonal antibody.

In a further aspect, the invention involves a method of screening for candidate drugs
which inhibit or prevent progression from dysplasia to adenocarcinoma, the method
comprising contacting a cell with a candidate drug, and assaying inhibition of progression
from high-grade dysplasia to cancer in the cell, wherein the cell, prior to contacting with the
candidate drug, expresses at least eight genes at a level at least 1.5-fold increased relative to a
normal tissue baseline level, wherein the genes are selected from the group of HGD marker
genes as disclosed herein.

In another aspect, the invention involves a method of inhibiting or preventing
progression from high-grade dysplasia to cancer in a patient by administering a drug identified
by screening for candidate drugs which inhibit or prevent progression from dysplasia to
adenocarcinoma, the method comprising contacting a cell with a candidate drug, and assaying
inhibition of progression from high-grade dysplasia to cancer in the cell, wherein the cell,
prior to contacting with the candidate drug, expresses at least eight genes at a level at least
1.5-fold increased relative to a normal tissue baseline level, wherein the genes are selected
from the group of HGD marker genes as disclosed herein.

In another aspect, the invention involves a compound capable of inhibiting or
preventing the progression from high-grade dysplasia to cancer in a patient. In an embodiment
of the invention the compound is identified by screening for a candidate drug which inhibits or
prevents progression from dysplasia to adenocarcinoma, the method comprising contacting a
cell with a candidate drug, wherein the cell expresses at least eight genes selected from the
group of HGD marker genes as disclosed herein at a level at least 1.5-fold above a normal
tissue baseline level, and assaying inhibition of progression from high-grade dysplasia to
cancer in the cell. In an embodiment, the invention involves a pharmaceutical composition
comprising a compound capable of inhibiting or preventing the progression from high-grade
dysplasia to cancer in a patient, and a pharmaceutically acceptable carrier.

In still another aspect, the invention involves detecting cancer in a patient by
determining that a gene, or the polypeptide it encodes, selected from the group consisting of
CAD17 (liver-intestine cadherin, NM_004063) (SEQ ID NO:45 or 46), CLDN15 (claudin 15,
NM_014343) (SEQ ID NO:47 or 48), SLNAC1 (sodium channel, NM_004769) (SEQ ID
NO:23 or 24), CFTR (chloride channel, NM_000492) (SEQ ID NO:49 or 50), H2R (histamine
H2 receptor, NM_022304) (SEQ ID NO:51 or 52), PRSS8 (serine protease, NM_002773)
(SEQ ID NO:7 or 8), PA21 (phospholipase A2 group IB, NM_000928) (SEQ ID NO:27 or
28), AGR2 (anterior gradient 2 homolog, NM_006408) (SEQ ID NO:3 or 4), EGFR
(NM_005228) (SEQ ID NO:53 or 54), EPHB2 (NM_004442) (SEQ ID NO:55 or 56),
CRIPTO CR-1 (NM_003212) (SEQ ID NO:57 or 58), Ephrin B1 (NM_004429) (SEQ ID
NO:59 or 60), MMP-17/MT4-MMP (NM_016155) (SEQ ID NO:61 or 62), MMP26
(NM_021801) (SEQ ID NO:63 or 64), ADAM10 (NM_001110) (SEQ ID NO:65 or 66),
ADAM8 (NM_001109) (SEQ ID NO:5 or 6), ADAM1 (XM_132370) (SEQ ID NO:67 or 68),
TIM1 (NM_003254) (SEQ ID NO:69 or 70), MUC1 (XM_053256) (SEQ ID NO:71 or 72),
CEA (NM_004363) (SEQ ID NO:73 or 74), NCA (NM_002483) (SEQ ID NO:75 or 76),
Follistatin (NM_006350) (SEQ ID NO:77 or 78), Claudin 1 (NM_021101) (SEQ ID NO:79 or
80), Claudin 14 (NM_012130) (SEQ ID NO:81 or 82), tenascin-R (NM_003285) (SEQ ID
NO:83 or 84), CAD3 (NM_001793) (SEQ ID NO:85 or 86), AXO1 (NM_005076) (SEQ ID
NO:9 or 10), CONT (NM_001843) (SEQ ID NO:87 or 88), Osteopontin (NM_000582) (SEQ
ID NO:89 or 90), Galectin 8 (NM_006499) (SEQ ID NO:91 or 92), PGS1 (biglycan,
NM_001711) (SEQ ID NO:93 or 94), Frizzled 2 (NM_001466) (SEQ ID NO:95 or 96), ISLR
(NM_005545) (SEQ ID NO:97 or 98), FLJ23399 (NM_022763) (SEQ ID NO:99 or 100),
TEM1 (NM_020404) (SEQ ID NO:101 or 102), Tie2 ligand2 (NM_001147) (SEQ ID NO:103
or 104), STC-2 (NM_003714) (SEQ ID NO:19 or 20), VEGFC (NM_005429) (SEQ ID
NO:105 or 106), tPA (NM_000930) (SEQ ID NO:107 or 108), Endothelin 1 (NM_001955)
(SEQ ID NO:1 or 2), Thrombomodulin (NM_000361) (SEQ ID NO:109 or 110), TF
(NM_001993) (SEQ ID NO:111 or 112), GPR4 (NM_005282) (SEQ ID NO:113 or 114),
GPR66 (NM_006056) (SEQ ID NO:115 or 116), SLC22A2 (NM_003058) (SEQ ID NO:117
or 118), MLSN1 (NM_002420) (SEQ ID NO:119 or 120), and ATN2 (Na/K transport,
NM_000702) (SEQ ID NO:121 or 122) is expressed at a level of about 1.5-fold in a test
sample above the level of expression in a normal tissue sample of the same tissue type. The
test sample is generally from a patient suspected of experiencing cancer, including, but not
limited to, adenocarcinoma, esophageal adenocarcinoma, or colon cancer. The test sample is
generally from the esophagus or colon of the patient. In an embodiment, at least two,
alternatively at least three, alternatively at least five, and alternatively at least eight genes
selected from the above group are upregulated in cancer tissue at 1.5-fold relative to normal
tissue. Detection of the up-regulation of these genes is determined by, for example,
hybridization analysis as standard in the art and disclosed herein, as well as through antibody
binding analysis of the level of polypeptides expressed by the up-regulated gene or genes.
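
The graded embodiments above (at least two, three, five or eight cancer marker genes upregulated about 1.5-fold) can be summarized as a tier lookup. The Python sketch below is illustrative only: the function name and the fold-change values are hypothetical, and it assumes fold changes relative to normal tissue of the same type have already been computed.

```python
from typing import Mapping, Optional, Sequence

TIERS = (2, 3, 5, 8)  # gene-count thresholds named in the embodiment
FOLD = 1.5            # upregulation threshold relative to normal tissue


def highest_tier_met(fold_changes: Mapping[str, float],
                     fold: float = FOLD,
                     tiers: Sequence[int] = TIERS) -> Optional[int]:
    """Return the largest tier whose gene count is reached, or None."""
    n_up = sum(1 for value in fold_changes.values() if value >= fold)
    for tier in sorted(tiers, reverse=True):
        if n_up >= tier:
            return tier
    return None


# Hypothetical fold changes (tumor / normal), for illustration only.
sample = {"CAD17": 3.1, "CLDN15": 2.4, "CFTR": 1.8, "EGFR": 1.6,
          "EPHB2": 2.0, "MUC1": 1.1, "CEA": 0.9}
print(highest_tier_met(sample))
```

Five of the seven hypothetical markers exceed the threshold here, so the sample reaches the "at least five" tier but not the "at least eight" tier.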

In an embodiment, the invention involves treatment by contacting a cancer cell with a
compound that inhibits expression of at least one, optionally at least two, at least three, at least
five, or at least eight genes, or the polypeptides encoded by the genes, selected from the group
consisting of CAD17 (liver-intestine cadherin, NM_004063) (SEQ ID NO:45 or 46), CLDN15
(claudin 15, NM_014343) (SEQ ID NO:47 or 48), SLNAC1 (sodium channel, NM_004769)
(SEQ ID NO:23 or 24), CFTR (chloride channel, NM_000492) (SEQ ID NO:49 or 50), H2R
(histamine H2 receptor, NM_022304) (SEQ ID NO:51 or 52), PRSS8 (serine protease,
NM_002773) (SEQ ID NO:7 or 8), PA21 (phospholipase A2 group IB, NM_000928) (SEQ ID
NO:27 or 28), AGR2 (anterior gradient 2 homolog, NM_006408) (SEQ ID NO:3 or 4), EGFR
(NM_005228) (SEQ ID NO:53 or 54), EPHB2 (NM_004442) (SEQ ID NO:55 or 56),
CRIPTO CR-1 (NM_003212) (SEQ ID NO:57 or 58), Ephrin B1 (NM_004429) (SEQ ID
NO:59 or 60), MMP-17/MT4-MMP (NM_016155) (SEQ ID NO:61 or 62), MMP26
(NM_021801) (SEQ ID NO:63 or 64), ADAM10 (NM_001110) (SEQ ID NO:65 or 66),
ADAM8 (NM_001109) (SEQ ID NO:5 or 6), ADAM1 (XM_132370) (SEQ ID NO:67 or 68),
TIM1 (NM_003254) (SEQ ID NO:69 or 70), MUC1 (XM_053256) (SEQ ID NO:71 or 72),
CEA (NM_004363) (SEQ ID NO:73 or 74), NCA (NM_002483) (SEQ ID NO:75 or 76),
Follistatin (NM_006350) (SEQ ID NO:77 or 78), Claudin 1 (NM_021101) (SEQ ID NO:79 or
80), Claudin 14 (NM_012130) (SEQ ID NO:81 or 82), tenascin-R (NM_003285) (SEQ ID
NO:83 or 84), CAD3 (NM_001793) (SEQ ID NO:85 or 86), AXO1 (NM_005076) (SEQ ID
NO:9 or 10), CONT (NM_001843) (SEQ ID NO:87 or 88), Osteopontin (NM_000582) (SEQ
ID NO:89 or 90), Galectin 8 (NM_006499) (SEQ ID NO:91 or 92), PGS1 (biglycan,
NM_001711) (SEQ ID NO:93 or 94), Frizzled 2 (NM_001466) (SEQ ID NO:95 or 96), ISLR
(NM_005545) (SEQ ID NO:97 or 98), FLJ23399 (NM_022763) (SEQ ID NO:99 or 100),
TEM1 (NM_020404) (SEQ ID NO:101 or 102), Tie2 ligand2 (NM_001147) (SEQ ID NO:103
or 104), STC-2 (NM_003714) (SEQ ID NO:19 or 20), VEGFC (NM_005429) (SEQ ID
NO:105 or 106), tPA (NM_000930) (SEQ ID NO:107 or 108), Endothelin 1 (NM_001955)
(SEQ ID NO:1 or 2), Thrombomodulin (NM_000361) (SEQ ID NO:109 or 110), TF
(NM_001993) (SEQ ID NO:111 or 112), GPR4 (NM_005282) (SEQ ID NO:113 or 114),
GPR66 (NM_006056) (SEQ ID NO:115 or 116), SLC22A2 (NM_003058) (SEQ ID NO:117
or 118), MLSN1 (NM_002420) (SEQ ID NO:119 or 120), and ATN2 (Na/K transport,
NM_000702) (SEQ ID NO:121 or 122). In another embodiment, treatment is by contacting
the cancer cell with a compound that inhibits the production or activity of a polypeptide of the
above group and/or encoded by a gene of the above group. Where inhibition of a polypeptide
is desired, the compound is often an antibody specific for the polypeptide, and is often a
monoclonal antibody, such as a humanized antibody.

In yet another aspect, the invention involves a method of screening a candidate
compound for the ability to inhibit cancer cell growth or cause cancer cell death by contacting
the candidate compound with a cancer cell expressing a gene or polypeptide selected from the
following group: CAD17 (liver-intestine cadherin, NM_004063) (SEQ ID NO:45 or 46),
CLDN15 (claudin 15, NM_014343) (SEQ ID NO:47 or 48), SLNAC1 (sodium channel,
NM_004769) (SEQ ID NO:23 or 24), CFTR (chloride channel, NM_000492) (SEQ ID NO:49
or 50), H2R (histamine H2 receptor, NM_022304) (SEQ ID NO:51 or 52), PRSS8 (serine
protease, NM_002773) (SEQ ID NO:7 or 8), PA21 (phospholipase A2 group IB, NM_000928)
(SEQ ID NO:27 or 28), AGR2 (anterior gradient 2 homolog, NM_006408) (SEQ ID NO:3 or
4), EGFR (NM_005228) (SEQ ID NO:53 or 54), EPHB2 (NM_004442) (SEQ ID NO:55 or
56), CRIPTO CR-1 (NM_003212) (SEQ ID NO:57 or 58), Ephrin B1 (NM_004429) (SEQ ID
NO:59 or 60), MMP-17/MT4-MMP (NM_016155) (SEQ ID NO:61 or 62), MMP26
(NM_021801) (SEQ ID NO:63 or 64), ADAM10 (NM_001110) (SEQ ID NO:65 or 66),
ADAM8 (NM_001109) (SEQ ID NO:5 or 6), ADAM1 (XM_132370) (SEQ ID NO:67 or 68),
TIM1 (NM_003254) (SEQ ID NO:69 or 70), MUC1 (XM_053256) (SEQ ID NO:71 or 72),
CEA (NM_004363) (SEQ ID NO:73 or 74), NCA (NM_002483) (SEQ ID NO:75 or 76),
Follistatin (NM_006350) (SEQ ID NO:77 or 78), Claudin 1 (NM_021101) (SEQ ID NO:79 or
80), Claudin 14 (NM_012130) (SEQ ID NO:81 or 82), tenascin-R (NM_003285) (SEQ ID
NO:83 or 84), CAD3 (NM_001793) (SEQ ID NO:85 or 86), AXO1 (NM_005076) (SEQ ID
NO:9 or 10), CONT (NM_001843) (SEQ ID NO:87 or 88), Osteopontin (NM_000582) (SEQ
ID NO:89 or 90), Galectin 8 (NM_006499) (SEQ ID NO:91 or 92), PGS1 (biglycan,
NM_001711) (SEQ ID NO:93 or 94), Frizzled 2 (NM_001466) (SEQ ID NO:95 or 96), ISLR
(NM_005545) (SEQ ID NO:97 or 98), FLJ23399 (NM_022763) (SEQ ID NO:99 or 100),
TEM1 (NM_020404) (SEQ ID NO:101 or 102), Tie2 ligand2 (NM_001147) (SEQ ID NO:103
or 104), STC-2 (NM_003714) (SEQ ID NO:19 or 20), VEGFC (NM_005429) (SEQ ID
NO:105 or 106), tPA (NM_000930) (SEQ ID NO:107 or 108), Endothelin 1 (NM_001955)
(SEQ ID NO:1 or 2), Thrombomodulin (NM_000361) (SEQ ID NO:109 or 110), TF
(NM_001993) (SEQ ID NO:111 or 112), GPR4 (NM_005282) (SEQ ID NO:113 or 114),
GPR66 (NM_006056) (SEQ ID NO:115 or 116), SLC22A2 (NM_003058) (SEQ ID NO:117
or 118), MLSN1 (NM_002420) (SEQ ID NO:119 or 120), and ATN2 (Na/K transport,
NM_000702) (SEQ ID NO:121 or 122), wherein at least one, at least two,
at least three, at least five, or at least eight genes selected from the group are expressed at a
level at least about 1.5-fold above the level in normal control tissue. Where the candidate
compound is an antibody, the antibody is alternatively a polyclonal, monoclonal, or humanized
antibody, a Fab, a F(ab')2, or a binding fragment of any one of these.
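
The expression criterion recited above (a stated number of marker genes at a level at least about 1.5-fold above normal control tissue) can be illustrated with a minimal sketch. The function name, the dictionary layout and the numeric values are hypothetical and are given for illustration only; they are not data from this specification, and the sketch assumes that expression levels have already been normalized.

```python
def overexpressed_genes(sample, control, genes, fold=1.5):
    """Return the genes whose sample level is at least `fold` times the control level."""
    hits = []
    for gene in genes:
        level = sample.get(gene)
        ctrl = control.get(gene)
        # A ratio cannot be formed without both values and a positive control level.
        if level is None or ctrl is None or ctrl <= 0:
            continue
        if level / ctrl >= fold:
            hits.append(gene)
    return hits


# Hypothetical normalized expression values (arbitrary units).
sample = {"AGR2": 9.0, "EGFR": 2.9, "CFTR": 4.0}
control = {"AGR2": 3.0, "EGFR": 2.0, "CFTR": 1.0}

hits = overexpressed_genes(sample, control, ["AGR2", "EGFR", "CFTR"])
meets_two_gene_criterion = len(hits) >= 2  # e.g. the "at least two" alternative
```

The minimum number of genes (one, two, three, five or eight) is left to the caller, so that each alternative recited above can be checked with the same function.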

In an embodiment, the sequences which are used to determine sequence identity or
similarity are selected from the sequences described herein. Optionally, sequence variants are
naturally occurring allelic variants, sequence variants or splice variants of these sequences.
Sequence identity is typically calculated using the BLAST algorithm, described in Altschul et
al., Nucleic Acids Res. 25:3389-3402 (1997), with the BLOSUM62 default matrix.
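
As a simplified illustration of what a percent identity figure expresses, the following sketch computes identity over two sequences that have already been aligned. It is not an implementation of the BLAST algorithm and does not use a scoring matrix; the function name, the use of "-" as the gap character and the example sequences are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding identical residues (gaps never match)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical aligned pair: 8 identical columns out of 10.
pid = percent_identity("ACGT-ACGTA", "ACGTTACCTA")
```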



In one embodiment, nucleic acid homology can be determined through hybridisation
studies. Nucleic acids which hybridise under stringent conditions to the nucleic acids of the
invention are considered high-grade esophageal dysplasia sequences. Under stringent
conditions, hybridisation will most preferably occur at 42°C in 750 mM NaCl, 75 mM
trisodium citrate, 2% SDS, 50% formamide, 1X Denhardt's, 10% (w/v) dextran sulphate and
100 µg/ml denatured salmon sperm DNA. Useful variations on these conditions will be readily
apparent to those skilled in the art. The washing steps which follow hybridisation most
preferably occur at 65°C in 15 mM NaCl, 1.5 mM trisodium citrate, and 1% SDS. Additional
variations on these conditions will be readily apparent to those skilled in the art.
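
The salt concentrations given above can be restated as multiples of standard saline citrate (SSC). The following is a minimal sketch, assuming the conventional definition of 1X SSC as 150 mM NaCl and 15 mM trisodium citrate; that definition and the function name are not taken from this specification.

```python
# Conventional 1X SSC composition (assumption, see the paragraph above).
SSC_1X_NACL_MM = 150.0
SSC_1X_CITRATE_MM = 15.0


def ssc_strength(nacl_mm, citrate_mm):
    """Return the SSC multiple for a buffer, checking that both salts agree."""
    nacl_x = nacl_mm / SSC_1X_NACL_MM
    citrate_x = citrate_mm / SSC_1X_CITRATE_MM
    if abs(nacl_x - citrate_x) > 1e-9:
        raise ValueError("NaCl and citrate are not in the SSC ratio")
    return nacl_x


hybridisation_x = ssc_strength(750.0, 75.0)  # hybridisation buffer
wash_x = ssc_strength(15.0, 1.5)             # wash buffer
```

On this reading, the hybridisation step corresponds to 5X SSC and the wash to 0.1X SSC, the lower salt concentration and higher temperature of the wash being what makes it the more stringent step.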

As a result of the degeneracy of the genetic code, a number of polynucleotide
sequences encoding polypeptides of the invention, some that may have minimal similarity to
the polynucleotide sequences of any known and naturally occurring gene, may be produced.
Thus, the invention includes each and every possible variation of polynucleotide sequence that
could be made by selecting combinations based on possible codon choices. These
combinations are made in accordance with the standard triplet genetic code as applied to the
polynucleotide sequence of naturally occurring high-grade esophageal dysplasia sequences,
and all such variations are to be considered as being specifically disclosed.
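
The scale of the degeneracy referred to above can be illustrated by counting the coding sequences that encode a given peptide under the standard genetic code. The following sketch is illustrative only; the function name and the example peptide are hypothetical.

```python
# Number of sense codons per amino acid in the standard genetic code.
CODONS_PER_AA = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_encodings(peptide):
    """Count the distinct coding sequences (stop codon excluded) for a peptide."""
    total = 1
    for residue in peptide:
        total *= CODONS_PER_AA[residue]
    return total


# Hypothetical four-residue peptide: 1 * 2 * 6 * 6 possible coding sequences.
n_variants = count_encodings("MKLS")
```

Because the count is a product over residues, it grows very rapidly with peptide length, which is why the variants are described collectively rather than listed.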

The polynucleotides of this invention include RNA, cDNA, genomic DNA, synthetic
forms, and mixed polymers, both sense and antisense strands, and may be chemically or
biochemically modified, or may contain non-natural or derivatised nucleotide bases as will be
appreciated by those skilled in the art. Such modifications include labels, methylation,
intercalators, alkylators and modified linkages. In some instances it may be advantageous to
produce nucleotide sequences encoding high-grade esophageal dysplasia sequences of the
invention, or their derivatives, possessing a substantially different codon usage than that of the
naturally occurring gene. For example, codons may be selected to increase the rate of
expression of the peptide in a particular prokaryotic or eukaryotic host corresponding with the
frequency that particular codons are utilized by the host. Other reasons to alter the nucleotide
sequence encoding high-grade esophageal dysplasia sequences of the invention, or their
derivatives, without altering the encoded amino acid sequences include the production of RNA
transcripts having more desirable properties, such as a greater half-life, than transcripts
produced from the naturally occurring sequence.
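
The selection of codons according to host usage, as described above, can be sketched as a back-translation using one preferred codon per amino acid. The preference table below is a hypothetical placeholder covering only four residues; it is not a measured codon usage table for any host, and the function name is likewise an assumption.

```python
# Hypothetical host-preferred codons (illustration only, not measured usage data).
PREFERRED_CODON = {"M": "ATG", "K": "AAA", "L": "CTG", "S": "AGC"}


def back_translate(peptide, preferred=PREFERRED_CODON):
    """Build a coding sequence using the preferred codon for each residue."""
    codons = []
    for residue in peptide:
        if residue not in preferred:
            raise ValueError(f"no preferred codon listed for residue {residue!r}")
        codons.append(preferred[residue])
    return "".join(codons)


coding_sequence = back_translate("MKLS")
```

The encoded amino acid sequence is unchanged by construction, since each codon in the table encodes the residue it is keyed to; only the nucleotide sequence differs from the naturally occurring one.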



In some instances, useful nucleic acid sequences up-regulated in high-grade esophageal
dysplasia of the invention are fragments of larger genes and may be used to identify and obtain
corresponding full-length genes. Full-length sequences of the genes selected from the HGD
gene marker group or cancer gene marker group of the invention can be obtained from a
partial gene sequence using methods known per se to those skilled in the art. For
example, "restriction-site PCR" may be used to retrieve unknown sequence adjacent to a
portion of DNA whose sequence is known. In this technique universal primers are used to
retrieve unknown sequence. Inverse PCR may also be used, in which primers based on the
known sequence are designed to amplify adjacent unknown sequences. These upstream
sequences may include promoters and regulatory elements. In addition, various other
PCR-based techniques may be used; for example, a kit available from Clontech (Palo Alto,
California) allows for a walking PCR technique, and the 5' RACE kit (Gibco-BRL) allows isolation
of additional sequence, while additional 3' sequence can be obtained using practised techniques.

The present invention allows for the preparation of purified high-grade dysplasia
polypeptide (i.e. a polypeptide encoded by a gene disclosed herein as up-regulated in
high-grade esophageal dysplasia) or protein, from the polynucleotides of the present invention or
variants thereof. In order to do this, host cells may be transfected with a nucleic acid molecule
as described above. Typically said host cells are transfected with an expression vector
comprising a nucleic acid encoding a high-grade esophageal dysplasia protein according to the
invention. Cells are cultured under the appropriate conditions to induce or cause expression of
the high-grade esophageal dysplasia protein. The conditions appropriate for high-grade
esophageal dysplasia protein expression will vary with the choice of the expression vector and
the host cell, and will be easily ascertained by one skilled in the art.

A variety of expression vector/host systems may be utilized to contain and express the
high-grade dysplasia sequences of the invention and are well known in the art. These include,
but are not limited to, microorganisms such as bacteria transformed with plasmid or cosmid
DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems
infected with viral expression vectors (e.g., baculovirus); or mouse or other animal or human
tissue cell systems. In a preferred embodiment the high-grade esophageal dysplasia proteins of
the invention are expressed in mammalian cells using various expression vectors including
plasmid, cosmid and viral systems such as adenoviral, retroviral or vaccinia virus expression
systems. The invention is not limited by the host cell employed.

The polynucleotide sequences, or variants thereof, of the present invention can be
stably expressed in cell lines to allow long term production of recombinant proteins in
mammalian systems. These sequences can be transformed into cell lines using expression
vectors which may contain viral origins of replication and/or endogenous expression elements
and a selectable marker gene on the same or on a separate vector. The selectable marker
confers resistance to a selective agent, and its presence allows growth and recovery of cells
which successfully express the introduced sequences. Resistant clones of stably transformed
cells may be propagated using tissue culture techniques appropriate to the cell type.

The protein produced by a transformed cell may be secreted or retained intracellularly
depending on the sequence and/or the vector used. As will be understood by those of skill in
the art, expression vectors containing polynucleotides which encode a protein of the invention
may be designed to contain signal sequences which direct secretion of the protein through a
prokaryotic or eukaryotic cell membrane.

In addition, a host cell strain may be chosen for its ability to modulate expression of
the inserted sequences or to process the expressed protein in the desired fashion. Such
modifications of the polypeptide include, but are not limited to, acetylation, glycosylation,
phosphorylation, and acylation. Post-translational cleavage of the protein may also be used to
specify protein targeting, folding, and/or activity. Different host cells having specific cellular
machinery and characteristic mechanisms for post-translational activities (e.g., CHO or HeLa
cells) are available from the American Type Culture Collection (ATCC) and may be chosen
to ensure the correct modification and processing of the foreign protein.

When large quantities of protein are needed, such as for antibody production, vectors
which direct high levels of high-grade esophageal dysplasia gene expression may be used, such
as those containing the T5 or T7 inducible bacteriophage promoter.

The present invention also includes the use of the expression systems described above
in generating and isolating fusion proteins which contain important functional domains of the
protein. These fusion proteins are used for binding, structural and functional studies as well as
for the generation of appropriate antibodies.

In order to express and purify the protein as a fusion protein, the appropriate cDNA
sequence is inserted into a vector which contains a nucleotide sequence encoding another
peptide (for example, glutathione S-transferase). The fusion protein is expressed and
recovered from prokaryotic or eukaryotic cells. The fusion protein can then be purified by
affinity chromatography based upon the fusion vector sequence. The relevant protein can
subsequently be obtained by enzymatic cleavage of the fusion protein.

In one embodiment, a fusion protein may be generated by the fusion of a high-grade
dysplasia polypeptide with a tag polypeptide which provides an epitope to which an anti-tag
antibody can selectively bind. The epitope tag is generally placed at the amino- or
carboxy-terminus of the high-grade esophageal dysplasia polypeptide. The presence of such
epitope-tagged forms of a high-grade esophageal dysplasia polypeptide can be detected using an
antibody against the tag polypeptide. Also, provision of the epitope tag enables the high-grade
dysplasia polypeptide to be readily purified by affinity purification using an anti-tag antibody
or another type of affinity matrix that binds to the epitope tag.

Various tag polypeptides and their respective antibodies are well known in the art.
Examples include poly-histidine or poly-histidine-glycine tags and the c-myc tag and
antibodies thereto. Fragments of high-grade dysplasia polypeptide may also be produced by
direct peptide synthesis using solid-phase techniques. Automated synthesis may be achieved
by using the ABI 433A Peptide Synthesizer (Applied Biosystems, Foster City, CA). Various
fragments of high-grade dysplasia polypeptide may be synthesized separately and then
combined to produce the full-length molecule.

In a further aspect of the invention there is provided a method of preparing a
polypeptide as described above, comprising the steps of: (1) culturing the host cells under
conditions effective for production of the polypeptide; and (2) harvesting the polypeptide.

Substantially purified high-grade dysplasia polypeptide or fragments thereof can then
be used in further biochemical analyses to establish secondary and tertiary structure, for
example by x-ray crystallography of the protein or by nuclear magnetic resonance (NMR).
Determination of structure allows for the rational design of pharmaceuticals to interact with
the protein, alter protein charge configuration or charge interaction with other proteins, or to
alter its function in the cell.

With the identification of the high-grade esophageal dysplasia marker gene nucleotide
sequences and the polypeptide sequences encoded by them, probes and antibodies raised to the
genes can be used in a variety of hybridisation and immunological assays to screen for and
detect the presence of either a normal or mutated gene or gene product.

In addition, the nucleotide and protein sequences of the high-grade dysplasia genes
provided in this invention enable therapeutic methods for the treatment of cancer, such as
adenocarcinoma associated with one or more of these genes, enable screening of compounds
for therapeutic intervention, and also enable methods for the diagnosis or prognosis of cancer
associated with these genes. Examples of such cancers include, but are not limited to,
esophageal adenocarcinoma.

Transducing retroviral vectors are often used for producing a cell line expressing a
gene above the level of expression in a cell lacking the additional copy of the gene. Such a
cell line is useful according to the invention for screening
candidate compounds capable of reducing expression of a gene associated with high-grade
esophageal dysplasia, reducing expression of a polypeptide encoded by the gene, or inhibiting
activity of the polypeptide, such that the cell does not progress from dysplasia to cancer. The
full-length high-grade dysplasia gene, or portions thereof, can be cloned into a retroviral
vector and expression can be driven from its endogenous promoter or from the retroviral long
terminal repeat or from a promoter specific for the target cell type of interest. Other viral
vectors can be used and include, as is known in the art, adenoviruses, adeno-associated virus,
vaccinia virus, papovaviruses, lentiviruses and retroviruses of avian, murine and human origin.

The viral vector described herein above is also useful for gene therapy to reduce the
activity of the high-grade dysplasia genes of the invention, such as by antisense expression
inhibition or RNA interference (see, for example, Paddison, P.J. et al., Genes & Development
16:948-958 (2002) and Brummelkamp, T.R. et al., Science 296:550-553 (2002)). Gene
therapy would be carried out according to established methods (Friedman, 1991; Culver,
1996). A vector containing a copy of a high-grade esophageal dysplasia gene linked to
expression control elements and capable of replicating inside the cells is prepared.
Alternatively the vector may be replication deficient and may require helper cells or helper
virus for replication and virus production and use in gene therapy.

Gene transfer using non-viral methods of infection can also be used. These methods
include direct injection of DNA, uptake of naked DNA in the presence of calcium phosphate,
electroporation, protoplast fusion or liposome delivery. Gene transfer can also be achieved by
delivery as a part of a human artificial chromosome or receptor-mediated gene transfer. This
involves linking the DNA to a targeting molecule that will bind to specific cell-surface
receptors to induce endocytosis and transfer of the DNA into mammalian cells. One such
technique uses poly-L-lysine to link asialoglycoprotein to DNA. An adenovirus is also added
to the complex to disrupt the lysosomes and thus allow the DNA to avoid degradation and
move to the nucleus. Infusion of these particles intravenously has resulted in gene transfer into
hepatocytes.

Inhibition of the function of high-grade esophageal dysplasia genes or polypeptides that are
up-regulated in cancer can be achieved in a variety of ways as would be appreciated by those
skilled in the art. Typically, a vector expressing the complement of a polynucleotide encoding
a high-grade dysplasia gene of the invention may be administered to a subject to treat or
prevent a disorder associated with increased activity and/or expression of the gene including,
but not limited to, those described above.
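
The complement referred to above can be derived from a DNA sequence by base pairing and reversal of strand orientation. The following is a minimal sketch; the function name and the six-base example are hypothetical, and the sketch assumes an unambiguous DNA alphabet (A, C, G, T).

```python
# Watson-Crick pairing for an unambiguous DNA alphabet, upper or lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical sense-strand fragment and its antisense counterpart.
antisense = reverse_complement("ATGGCC")
```

Applying the function twice returns the original sequence, which gives a simple consistency check on any antisense design derived in this way.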

Antisense strategies may use a variety of approaches including the use of antisense
oligonucleotides, ribozymes, DNAzymes, injection of antisense RNA and transfection of
antisense RNA expression vectors. Many methods for introducing vectors into cells or tissues
are available and equally suitable for use in vivo, in vitro, and ex vivo. For ex vivo therapy,
vectors may be introduced into stem cells taken from the patient and clonally propagated for
autologous transplant back into that same patient. Delivery by transfection, by liposome
injections, or by polycationic amino polymers may be achieved using methods which are well
known in the art (see, for example, Goldman, C.K. et al., Nature Biotechnology 15:462-466
(1997)).

Where purified protein or polypeptide is used to produce antibodies which specifically
bind a high-grade dysplasia protein, the antibody(ies) are used directly as an antagonist or
indirectly as a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or
tissues that express the protein. Such antibodies may include, but are not limited to,
polyclonal, monoclonal, chimeric and single chain antibodies as would be understood by the
person skilled in the art.

For the production of antibodies, various hosts including rabbits, rats, goats, mice,
humans, and others may be immunized by injection with a protein of the invention or with any
fragment or oligopeptide thereof, which has immunogenic properties. Various adjuvants may
be used to increase immunological response and include, but are not limited to, Freund's,
mineral gels such as aluminum hydroxide, and surface-active substances such as lysolecithin.
Adjuvants used in humans include BCG (bacilli Calmette-Guerin) and Corynebacterium
parvum.

It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies
to the high-grade dysplasia polypeptides of the invention have an amino acid sequence consisting of at least
about 5 amino acids, and, more preferably, of at least about 10 amino acids. It is also
preferable that these oligopeptides, peptides, or fragments are identical to a portion of the
amino acid sequence of the natural protein and contain the entire amino acid sequence of a
small, naturally occurring molecule. Short stretches of amino acids from these proteins may be
fused with those of another protein, such as KLH, and antibodies to the chimeric molecule
may be produced.

Monoclonal antibodies to high-grade dysplasia polypeptides or proteins of the
invention may be prepared using any technique which provides for the production of antibody
molecules by continuous cell lines in culture. These include, but are not limited to, the
hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma
technique. (For example, see Kohler, G. and Milstein, C., Nature 256:495-497 (1975); Kozbor,
D. et al., J. Immunol. Methods 81:31-42 (1985); and Cole, S.P. et al., Mol. Cell Biol. 62:109-120
(1984)).

Antibodies may also be produced by inducing in vivo production in the lymphocyte
population or by screening immunoglobulin libraries or panels of highly specific binding
reagents as disclosed in the literature.

Antibody fragments which contain specific binding sites for the high-grade esophageal
dysplasia proteins may also be generated. For example, such fragments include F(ab')2 fragments
produced by pepsin digestion of the antibody molecule and Fab fragments generated by
reducing the disulfide bridges of the F(ab')2 fragments. Alternatively, Fab expression libraries
may be constructed to allow rapid and easy identification of monoclonal Fab fragments with
the desired specificity. (For example, see Huse, W. D. et al., Science 246:1275-1281 (1989)).
Various immunoassays well known in the art may be used for screening to identify antibodies
having the desired specificity.

Numerous protocols for competitive binding or immunoradiometric assays using either
polyclonal or monoclonal antibodies with established specificities are well known in the art.
Such immunoassays typically involve the measurement of complex formation between a
protein and its specific antibody. A two-site, monoclonal-based immunoassay utilizing
monoclonal antibodies reactive to two non-interfering epitopes is preferred, but a competitive
binding assay may also be employed.

Candidate pharmaceutical agents or compounds encompass numerous chemical
classes, though typically they are organic molecules, preferably small organic compounds
having molecular weight of more than 100 and less than about 2,500 daltons. Candidate agents
are also found among biomolecules including peptides, saccharides, fatty acids and steroids.

Agent screening techniques include, but are not limited to, utilising eukaryotic or
prokaryotic host cells that are stably transformed with recombinant molecules expressing a
particular high-grade dysplasia polypeptide of the invention, or fragment thereof, preferably in
competitive binding assays. Binding assays will measure the formation of complexes
between the high-grade esophageal dysplasia polypeptide, or fragments thereof, and the agent
being tested, or will measure the degree to which an agent being tested will interfere with the
formation of a complex between the high-grade esophageal dysplasia polypeptide, or fragment
thereof, and a known ligand.
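
The degree of interference measured in a competitive binding assay is commonly summarised as a percent inhibition of the ligand signal. The following sketch shows one way such a figure might be computed; the formula, the function name and the signal values are assumptions made for illustration and are not prescribed by this specification.

```python
def percent_inhibition(signal_with_agent, signal_no_agent, background=0.0):
    """Percent reduction in specific ligand binding caused by the test agent."""
    window = signal_no_agent - background
    if window <= 0:
        raise ValueError("uninhibited signal must exceed background")
    specific = signal_with_agent - background
    return 100.0 * (1.0 - specific / window)


# Hypothetical readings in arbitrary signal units.
inhibition = percent_inhibition(400.0, 1000.0, background=100.0)
```

A value near zero indicates that the agent does not interfere with complex formation, while a value approaching 100 indicates near-complete displacement of the known ligand under the assumed assay conditions.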

Another technique for drug screening provides high-throughput screening for
compounds having suitable binding affinity to a high-grade dysplasia polypeptide. In such a
technique, large numbers of small peptide test compounds are synthesised on a solid substrate
and can be assayed through high-grade esophageal dysplasia polypeptide binding and
washing. Bound high-grade dysplasia polypeptide is then detected by methods well known in
the art. In a variation of this technique, purified polypeptides can be coated directly onto plates
to identify interacting test compounds.

An additional method for drug screening involves the use of host eukaryotic cell lines
which carry mutations in a particular high-grade dysplasia gene. The host cell lines are also
defective at the polypeptide level. Other cell lines may be used where the gene expression of
the high-grade esophageal dysplasia gene can be switched off or up-regulated. The host cell
lines or cells are grown in the presence of various drug compounds and the rate of growth of
the host cells is measured to determine if the compound is capable of regulating the growth of
defective cells.

A high-grade esophageal dysplasia polypeptide encoded by an HGD marker gene may
also be used for screening compounds developed as a result of combinatorial library
technology. This provides a way to test a large number of different substances for their ability
to modulate activity of a polypeptide. The use of peptide libraries is preferred; such
libraries and their use are known in the art.

A substance identified as a modulator of polypeptide function may be peptide or
non-peptide in nature. Non-peptide "small molecules" are often preferred for many in vivo
pharmaceutical applications. In addition, a mimic or mimetic of the substance may be
designed for pharmaceutical use. The design of mimetics based on a known pharmaceutically
active compound (i.e., a "lead compound") is a common approach to the development of novel
pharmaceuticals. This is often desirable where the original active compound is difficult or
expensive to synthesise or where it is unsuitable for a particular method of administration. In the
design of a mimetic, particular parts of the original active compound that are important in
determining the target property are identified. These parts or residues constituting the active
region of the compound are known as its pharmacophore. Once found, the pharmacophore
structure is modelled according to its physical properties using data from a range of sources
including x-ray diffraction data and NMR. A template molecule is then selected onto which
chemical groups which mimic the pharmacophore can be added. The selection can be made
such that the mimetic is easy to synthesise, is likely to be pharmacologically acceptable, does
not degrade in vivo and retains the biological activity of the lead compound. Further
optimisation or modification can be carried out to select one or more final mimetics useful for
in vivo or clinical testing.

It is also possible to isolate a target-specific antibody and then solve its crystal
structure. In principle, this approach yields a pharmacophore upon which subsequent drug
design can be based as described above. It may be possible to avoid protein crystallography
altogether by generating anti-idiotypic antibodies (anti-ids) to a functional, pharmacologically
active antibody.

As a mirror image of a mirror image, the binding site of the anti-ids would be expected
to be an analogue of the original binding site. The anti-id could then be used to isolate peptides
from chemically or biologically produced peptide banks.

In further embodiments, any of the genes, proteins, antagonists, antibodies, 
complementary sequences, or vectors of the invention may be administered in combination 
with other appropriate therapeutic agents. 

Selection of the appropriate agents may be made by those skilled in the art, according
to conventional pharmaceutical principles. The combination of therapeutic agents may act
synergistically to effect the treatment or prevention of the various disorders described above.
Using this approach, therapeutic efficacy with lower dosages of each agent may be possible,
thus reducing the potential for adverse side effects.

In a further aspect a pharmaceutical composition and a pharmaceutically acceptable 
carrier may be administered to a patient diagnosed as experiencing high-grade esophageal 
dysplasia for the inhibition or prevention of progression of the disease to adenocarcinoma. 

The pharmaceutical composition may comprise any one or more of a polypeptide as
described above, typically a substantially purified high-grade esophageal dysplasia
polypeptide, an antibody to a high-grade esophageal dysplasia polypeptide, a vector capable of
expressing a high-grade esophageal dysplasia polypeptide, a compound which increases or
decreases expression of a high-grade esophageal dysplasia gene, a candidate drug that restores
wild-type activity to a high-grade esophageal dysplasia gene or an antagonist of a high-grade
esophageal dysplasia gene.

The pharmaceutical composition may be administered to a subject to treat or prevent a
cancer associated with decreased activity and/or expression of a high-grade esophageal
dysplasia gene including, but not limited to, those provided above.

Pharmaceutical compositions in accordance with the present invention are prepared by mixing
a polypeptide of the invention, or active fragments or variants thereof, having the desired
degree of purity, with acceptable carriers, excipients, or stabilizers which are well known.

Acceptable carriers, excipients or stabilizers are nontoxic at the dosages and
concentrations employed, and include buffers such as phosphate, citrate, and other organic
acids; antioxidants including ascorbic acid; low molecular weight (less than about 10 residues)
polypeptides; proteins, such as serum albumin, gelatin, or immunoglobulins; hydrophilic
polymers such as polyvinylpyrrolidone; amino acids such as glycine, glutamine, asparagine,
arginine or lysine; monosaccharides, disaccharides, and other carbohydrates including glucose,
mannose, or dextrins; chelating agents such as EDTA; sugar alcohols such as mannitol or
sorbitol; salt-forming counterions such as sodium; and/or nonionic surfactants such as Tween,
Pluronics or polyethylene glycol (PEG).

Any of the therapeutic methods described above may be applied to any subject in need
of such therapy, including, for example, mammals such as dogs, cats, cows, horses, rabbits,
monkeys, and most preferably, humans.

Polynucleotide sequences encoding the high-grade esophageal dysplasia genes of the
invention may be used for the diagnosis or prognosis of cancers associated with their
dysfunction, or a predisposition to such cancers. Examples of such cancers include, but are not
limited to, adenocarcinoma, such as in patients having Barrett's esophagus. Diagnosis or
prognosis may be used to determine the severity, type or stage of the disease state in order to
initiate an appropriate therapeutic intervention.

In another embodiment of the invention, the polynucleotides that may be used for
diagnostic or prognostic purposes include oligonucleotide sequences, genomic DNA and
complementary RNA and DNA molecules. The polynucleotides may be used to detect and
quantitate gene expression in biopsied tissues in which mutations or abnormal expression of
the relevant high-grade esophageal dysplasia gene may be correlated with disease. Genomic
DNA used for the diagnosis or prognosis may be obtained from body cells, such as those
present in the blood, tissue biopsy, surgical specimen, or autopsy material. The DNA may be
isolated and used directly for detection of a specific sequence or may be amplified by the
polymerase chain reaction (PCR) prior to analysis. Similarly, RNA or cDNA may also be
used, with or without PCR amplification. To detect a specific nucleic acid sequence, direct
nucleotide sequencing, reverse transcriptase PCR (RT-PCR), hybridization using specific
oligonucleotides, restriction enzyme digest and mapping, PCR mapping, RNAse protection,
and various other methods may be employed.

Oligonucleotides specific to particular sequences can be chemically synthesized and
labelled radioactively or non-radioactively and hybridised to individual samples immobilized
on membranes or other solid supports or in solution. The presence, absence or excess
expression of a particular high-grade esophageal dysplasia gene may then be visualized using 
methods such as autoradiography, fluorometry, or colorimetry. 

In a particular aspect, the nucleotide sequences encoding a high-grade esophageal 
dysplasia gene of the invention may be useful in assays that detect the presence of associated 
disorders, particularly those mentioned previously. The nucleotide sequences encoding the 
relevant high-grade esophageal dysplasia gene may be labelled by standard methods and 
added to a fluid or tissue sample from a patient under conditions suitable for the formation of
hybridization complexes. 

After a suitable incubation period, the sample is washed and the signal is quantitated 
and compared with a standard value. If the amount of signal in the patient sample is
significantly altered in comparison to a control sample, then the presence of altered levels of
nucleotide sequences encoding the high-grade esophageal dysplasia gene in the sample 
indicates the presence of the associated disorder. Such assays may also be used to evaluate the 
efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or to 
monitor the treatment of an individual patient. 
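
As a minimal sketch of the comparison described above, the following Python function scores a patient signal against a set of control measurements and flags it as altered when it lies more than a chosen number of control standard deviations from the control mean. The function name, the cut-off of 2.0 and the sample intensities are illustrative assumptions; the specification does not prescribe a particular statistical test.

```python
from statistics import mean, stdev


def is_significantly_altered(patient_signal, control_signals, threshold=2.0):
    """Return (z, altered) for a patient signal relative to control samples.

    z is the number of control standard deviations separating the patient
    signal from the control mean. `threshold` is a hypothetical cut-off,
    not a value taken from the specification.
    """
    if len(control_signals) < 2:
        raise ValueError("at least two control measurements are required")
    mu = mean(control_signals)
    sigma = stdev(control_signals)  # sample standard deviation of the controls
    if sigma == 0:
        raise ValueError("control signals show no variation")
    z = (patient_signal - mu) / sigma
    return z, abs(z) > threshold


# Made-up hybridization signal intensities, for illustration only
controls = [100.0, 110.0, 90.0, 105.0, 95.0]
z, altered = is_significantly_altered(180.0, controls)
```

The test is two-sided, so both increased and decreased signal count as "altered", which matches the wording of the paragraph above.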

In order to provide a basis for the diagnosis or prognosis of a disorder associated with a 
mutation in a particular high-grade esophageal dysplasia gene of the invention, the nucleotide 
sequence of the relevant gene can be compared between normal tissue and diseased tissue in 
order to establish whether the patient expresses a mutant gene.

In order to provide a basis for the diagnosis or prognosis of a disorder associated with
abnormal expression of a particular high-grade esophageal dysplasia gene of the invention, a
normal or standard profile for expression is established. This may be accomplished by
combining body fluids or cell extracts taken from normal subjects, either animal or human,
with a sequence, or a fragment thereof, encoding the relevant high-grade esophageal dysplasia 
gene, under conditions suitable for hybridization or amplification. Standard hybridization may 
be quantified by comparing the values obtained from normal subjects with values from an 
experiment in which a known amount of a substantially purified polynucleotide is used. 

Another method to identify a normal or standard profile for expression of a particular 
high-grade esophageal dysplasia gene is through quantitative RT-PCR studies. RNA isolated 
from body cells of a normal individual, particularly RNA isolated from tumour cells, is reverse 
transcribed and real-time PCR using oligonucleotides specific for the relevant high-grade
esophageal dysplasia gene is conducted to establish a normal level of expression of the gene.

Standard values obtained in both these examples may be compared with values 
obtained from samples from patients who are symptomatic for a disorder. Deviation from 
standard values is used to establish the presence of a disorder. 

Once the presence of a disorder is established and a treatment protocol is initiated, 
hybridization assays or quantitative RT-PCR studies may be repeated on a regular basis to 
determine if the level of expression in the patient begins to approximate that which is observed 
in the normal subject. The results obtained from successive assays may be used to show the 
efficacy of treatment over a period ranging from several days to months.
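
The monitoring step above can be sketched as a simple check on a series of expression measurements taken during treatment. The function below is only an illustration under stated assumptions: the name `approaches_normal`, the rule that the distance from the normal level must never increase between assays, and the example values are all hypothetical and are not taken from the specification.

```python
def approaches_normal(successive_levels, normal_level):
    """Return True if the measured levels move toward the normal level.

    The (assumed) criterion is that the absolute distance from
    `normal_level` never increases from one assay to the next and that
    the last assay is strictly closer to normal than the first.
    """
    if len(successive_levels) < 2:
        raise ValueError("at least two successive assays are required")
    distances = [abs(level - normal_level) for level in successive_levels]
    never_worse = all(
        later <= earlier for earlier, later in zip(distances, distances[1:])
    )
    return never_worse and distances[-1] < distances[0]


# Made-up relative expression values from successive assays
responding = approaches_normal([8.0, 6.5, 4.0, 2.5], normal_level=2.0)
not_responding = approaches_normal([8.0, 9.0, 7.0], normal_level=2.0)
```

A stricter or more tolerant rule (for example, allowing small fluctuations between assays) may be more appropriate in practice; the monotonic rule is used here only because it is easy to state.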

In one aspect, hybridization with PCR probes which are capable of detecting
polynucleotide sequences, including genomic sequences, encoding a particular high-grade
esophageal dysplasia gene, or closely related molecules, may be used to identify nucleic acid
sequences which encode the gene. The specificity of the probe, whether it is made from a
highly specific region, e.g., the 5' regulatory region, or from a less specific region, e.g., a
conserved motif, and the stringency of the hybridization or amplification will determine 
whether the probe identifies only naturally occurring sequences encoding the high-grade 
esophageal dysplasia gene, allelic variants, or related sequences. 

Probes may also be used for the detection of related sequences, and should preferably
have at least 50% sequence identity to any of the high-grade esophageal dysplasia encoding
sequences. The hybridization probes of the subject invention may be DNA or RNA and may
be derived from the sequence of HGD marker genes disclosed in Table 4 or from genomic
sequences including promoters, enhancers, and introns of the genes. 
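
To illustrate the 50% identity criterion, the sketch below computes percent identity for two sequences that have already been aligned to the same length and checks the result against that threshold. It is an assumption-laden example: the function names are hypothetical, gap characters are simply counted as mismatches, and the specification may rely on a different identity definition or alignment program.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same base.

    Both inputs must be pre-aligned strings of equal length. A column that
    contains a gap character ('-') in either sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_probe_threshold(aligned_probe, aligned_target, minimum=50.0):
    """True if the probe has at least `minimum` percent identity to the target."""
    return percent_identity(aligned_probe, aligned_target) >= minimum


# Made-up 8-base sequences differing at two positions
identity = percent_identity("ATGCCGTA", "ATGACGTT")
```

For real probes the alignment itself would normally be produced by a dedicated alignment tool before a calculation of this kind is applied.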

Means for producing specific hybridization probes for DNAs encoding the high-grade 
esophageal dysplasia genes of the invention include the cloning of polynucleotide sequences 
encoding these genes or their derivatives into vectors for the production of mRNA probes.
Such vectors are known in the art, and are commercially available. Hybridization probes may
be labelled by radionuclides such as 32P or 35S, or by enzymatic labels, such as alkaline
phosphatase coupled to the probe via avidin/biotin coupling systems, or other methods known 
in the art. 

According to a further aspect of the invention there is provided the use of a polypeptide 
as described above in the diagnosis or prognosis of a cancer associated with a high-grade 
esophageal dysplasia gene of the invention, or a predisposition to such cancers. 

When a diagnostic or prognostic assay is to be based upon a high-grade esophageal
dysplasia protein, a variety of approaches are possible. For example, diagnosis or prognosis
can be achieved by monitoring differences in the electrophoretic mobility of normal and
mutant proteins. Such an approach will be particularly useful in identifying mutants in which
charge substitutions are present, or in which insertions, deletions or substitutions have resulted
in a significant change in the electrophoretic migration of the resultant protein. Alternatively,
diagnosis may be based upon differences in the proteolytic cleavage patterns of normal and 
mutant proteins, differences in molar ratios of the various amino acid residues, or by 
functional assays demonstrating altered function of the gene products. 

In another aspect, antibodies that specifically bind a high-grade esophageal dysplasia
gene of the invention may be used for the diagnosis or prognosis of cancers characterized by
abnormal expression of the gene, or in assays to monitor patients being treated with the gene
or agonists, antagonists, or inhibitors of the gene. Antibodies useful for diagnostic purposes
may be prepared in the same manner as described above for therapeutics. Diagnostic or
prognostic assays include methods that utilize the antibody and a label to detect a high-grade
esophageal dysplasia gene of the invention in human body fluids or in extracts of cells or
tissues. The antibodies may be used with or without modification, and may be labelled by
covalent or non-covalent attachment of a reporter molecule.

A variety of protocols for measuring a high-grade esophageal dysplasia gene of the 
invention, including ELISA, RIAs, and FACS, are known in the art and provide a basis for 
diagnosing altered or abnormal levels of their expression. Normal or standard values for their 
expression are established by combining body fluids or cell extracts taken from normal
mammalian subjects, preferably human, with antibody to the high-grade esophageal dysplasia
protein under conditions suitable for complex formation. The amount of standard complex
formation may be quantitated by various methods, preferably by photometric means.
Quantities of any of the high-grade esophageal dysplasia genes expressed in subject, control,
and disease samples from biopsied tissues are compared with the standard values. Deviation
between standard and subject values establishes the parameters for diagnosing disease.

Once an individual has been diagnosed with a cancer, effective treatments can be 
initiated. These may include administering a selective agonist to the relevant mutant high-grade
esophageal dysplasia gene so as to restore its function to a normal level or introduction
of the wild-type gene, particularly through gene therapy approaches as described above.
Typically, a vector capable of expressing the appropriate full-length high-grade esophageal
dysplasia gene or a fragment or derivative thereof may be administered. In an alternative
approach to therapy, a substantially purified high-grade esophageal dysplasia polypeptide and
a pharmaceutically acceptable carrier may be administered, as described above, or drugs
which can replace the function of or mimic the action of the relevant high-grade esophageal
dysplasia gene may be administered. 

In the treatment of cancers associated with increased high-grade esophageal dysplasia 
gene expression and/or activity, the affected individual may be treated with a selective 
antagonist such as an antibody to the relevant protein or an antisense (complement) probe to
the corresponding gene as described above, or through the use of drugs which may block the
action of the relevant high-grade esophageal dysplasia gene.

In further embodiments, complete cDNAs, oligonucleotides or longer fragments
derived from any of the polynucleotide sequences described herein may be used as targets in a
microarray. The microarray can be used to monitor the expression level of large numbers of
genes simultaneously and to identify genetic variants, mutations, and polymorphisms. This
information may be used to determine gene function, to understand the genetic basis of a
disorder, to detect or prognose a disorder, and to develop and monitor the activities of
therapeutic agents. Microarrays may be prepared, used, and analyzed using methods known in
the art (for example, see Schena, M. et al., PNAS USA 93:10614-10619 (1996); Heller, R.A. et
al., PNAS USA 94:2150-2155 (1997); and Heller, M.J., Annual Review of Biomedical
Engineering 4:129-53 (2002)).

The present invention also provides for the production of genetically modified (knock-out,
knock-down, knock-in and transgenic), non-human animal models transformed with the
DNA molecules of the invention. These animals are useful for the study of high-grade
esophageal dysplasia gene function, to study the mechanisms of cancer as related to the high-grade
esophageal dysplasia genes, for the screening of candidate pharmaceutical compounds,
for the creation of explanted mammalian cell cultures which express the protein or mutant 
protein and for the evaluation of potential therapeutic interventions. 

One of the high-grade esophageal dysplasia genes of the invention may have been
inactivated by knock-out deletion, and knock-out genetically modified non-human animals are
therefore provided. 

Animal species which are suitable for use in the animal models of the present invention 
include, but are not limited to, rats, mice, hamsters, guinea pigs, rabbits, dogs, cats, goats,
sheep, pigs, and non-human primates such as monkeys and chimpanzees. For initial studies,
genetically modified mice and rats are highly desirable due to their relative ease of
maintenance and shorter life spans. For certain studies, transgenic yeast or invertebrates may
be suitable and preferred because they allow for rapid screening and provide for much easier
handling. For longer term studies, non-human primates may be desired due to their similarity
with humans. 



To create an animal model for a mutated high-grade esophageal dysplasia gene of the 
invention, several methods can be employed. These include generation of a specific mutation
in a homologous animal gene, insertion of a wild type human gene and/or a humanized animal
gene by homologous recombination, insertion of a mutant (single or multiple) human gene as
genomic or minigene cDNA constructs using wild type or mutant or artificial promoter
elements or insertion of artificially modified fragments of the endogenous gene by
homologous recombination. The modifications include insertion of mutant stop codons, the
deletion of DNA sequences, or the inclusion of recombination elements (loxP sites)
recognized by enzymes such as Cre recombinase. 

To create a transgenic mouse, which is preferred, a mutant version of a particular high-grade
esophageal dysplasia gene of the invention can be inserted into a mouse germ line using
standard techniques of oocyte microinjection or transfection or microinjection into embryonic 
stem cells. Alternatively, if it is desired to inactivate or replace the endogenous high-grade 
esophageal dysplasia gene, homologous recombination using embryonic stem cells may be 
applied. For oocyte injection, one or more copies of the mutant or wild type high-grade 
esophageal dysplasia gene can be inserted into the pronucleus of a just-fertilized mouse 
oocyte. This oocyte is then reimplanted into a pseudo-pregnant foster mother. The liveborn 
mice can then be screened for integrants using analysis of tail DNA for the presence of human 
high-grade esophageal dysplasia gene sequences. The transgene can be either a complete 
genomic sequence injected as a YAC, BAC, PAC or other chromosome DNA fragment, a 
cDNA with either the natural promoter or a heterologous promoter, or a minigene containing 
all of the coding region and other elements found to be necessary for optimum expression. 
The genetically modified non-human animals as described above are useful for the screening 
of candidate pharmaceutical compounds. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figures 1A and 1B are graphs showing a distribution of expression of IL-1H1 (Fig.
1A) and CYP2J2 (Fig. 1B) in the dysplasia-carcinoma sequence in BE. Expression in normal
epithelium and in esophageal epithelia from samples of Barrett's esophagus (BE), dysplasia
(D), BE adjacent to adenocarcinoma (BE-CA), and adenocarcinoma (CA) are plotted. The
vertical line denotes the average Z score in each disease group. Normal refers to the normal 
esophagus group. Dysplasia includes low- and high-grade dysplasia samples. 





WO 2004/044178 



PCT/US2003/036260 



Figures 2A and 2B are graphs showing a distribution of expression of AGR2 (Fig. 2A) 
and NROB2 (Fig. 2B) in the dysplasia-carcinoma sequence in BE. Expression in esophageal 
epithelia from samples of Barrett's esophagus (BE), dysplasia (D), BE adjacent to 
adenocarcinoma (BE-CA), and adenocarcinoma (CA) are plotted. The vertical line denotes
the average Z score in each disease group. Normal refers to pooled epithelia samples.
Dysplasia includes low- and high-grade dysplasia samples. 

Figures 3A and 3B are graphs showing a distribution of expression of TCF4 (Fig. 3A) and FLJ23399 (Fig. 3B) in 
the dysplasia-carcinoma sequence in BE. Expression in esophageal epithelia from samples of Barrett's 
esophagus (BE), dysplasia (D), BE adjacent to adenocarcinoma (BE-CA), and adenocarcinoma (CA) are
plotted. The vertical line denotes the average Z score in each disease group. Normal refers to pooled epithelia 
samples. Dysplasia includes low- and high-grade dysplasia samples. 

Figures 4A and 4B show the nucleic acid sequence (SEQ ID NO:1) and the amino
acid sequence (SEQ ID NO:2) of ET-1 (endothelin-1, NM_001955).

Figures 5A and 5B show the nucleic acid sequence (SEQ ID NO:3) and the amino acid
sequence (SEQ ID NO:4) of AGR2 (anterior gradient 2 (Xenopus laevis) homolog,
NM_006408). 

Figures 6A and 6B show the nucleic acid sequence (SEQ ID NO:5) and the amino acid
sequence (SEQ ID NO:6) of ADAM8 (NM_001109).

Figures 7A and 7B show the nucleic acid sequence (SEQ ID NO:7) and the amino acid 
sequence (SEQ ID NO:8) of PRSS8 (Prostasin precursor, serine protease, NM_002773).

Figures 8A-8C show the nucleic acid sequence (SEQ ID NO:9) and Figure 8D shows 
the amino acid sequence (SEQ ID NO:10) of AXO1 (Axonin-1 precursor, NM_005076).

Figures 9A and 9B show the nucleic acid sequence (SEQ ID NO:11) and the amino
acid sequence (SEQ ID NO:12) of NROB2 (Nuclear hormone receptor, NM_021969).

Figures 10A and 10B show the nucleic acid sequence (SEQ ID NO: 13) and the amino 
acid sequence (SEQ ID NO: 14) of TM7SF1 (NM_003272). 

Figures 11A and 11B show the nucleic acid sequence (SEQ ID NO:15) and the amino
acid sequence (SEQ ID NO:16) of DLDH (dihydrolipoamide dehydrogenase, NM_000108).

Figures 12A and 12B show the nucleic acid sequence (SEQ ID NO: 17) and the amino 
acid sequence (SEQ ID NO:18) of MAT2B (methionine adenosyltransferase II, beta,
NM_013283).

Figures 13A and 13B show the nucleic acid sequence (SEQ ID NO:19) and the amino
acid sequence (SEQ ID NO:20) of STC-2 (stanniocalcin-2, NM_003714). 

Figures 14A and 14B show the nucleic acid sequence (SEQ ID NO:21) and the amino 
acid sequence (SEQ ID NO:22) of PPBI (alkaline phosphatase, intestinal precursor, 
NM_001631). 

Figures 15A and 15B show the nucleic acid sequence (SEQ ID NO:23) and the amino
acid sequence (SEQ ID NO:24) of SLNAC1 (sodium channel receptor SLNAC1,
NM_004769).

Figures 16A and 16B show the nucleic acid sequence (SEQ ID NO:25) and the amino
acid sequence (SEQ ID NO:26) of CAH4 (carbonic anhydrase iv precursor, NM_000717).

Figures 17A and 17B show the nucleic acid sequence (SEQ ID NO:27) and the
amino acid sequence (SEQ ID NO:28) of PA21 (phospholipase a2 precursor, NM_000928).

Figures 18A and 18B show the nucleic acid sequence (SEQ ID NO:29) and the amino
acid sequence (SEQ ID NO:30) of PAR2 (proteinase activated receptor 2 precursor,
NM_005242). 

Figures 19A and 19B show the nucleic acid sequence (SEQ ID NO:31) and the amino
acid sequence (SEQ ID NO:32) of IDE (insulin-degrading enzyme, NM_004969).

Figures 20A-20B show the nucleic acid sequence (SEQ ID NO:33) and Figure 20C 
shows the amino acid sequence (SEQ ID NO:34) of MYO1A (myosin-1A, NM_005379).

Figures 21A and 21B show the nucleic acid sequence (SEQ ID NO:35) and the amino acid
sequence (SEQ ID NO:36) of CYP2J2 (cytochrome P450 monooxygenase, NM_000775).

Figures 22A and 22B show the nucleic acid sequence (SEQ ID NO:37) and the amino
acid sequence (SEQ ID NO:38) of PHYH (phytanoyl-CoA-hydroxylase (Refsum disease),
NM_006214). 

Figures 23A and 23B show the nucleic acid sequence (SEQ ID NO:39) and the amino
acid sequence (SEQ ID NO:40) of CYB5 (cytochrome b5, 3' end, NM_001914). 

Figures 24A and 24B show the nucleic acid sequence (SEQ ID NO:41) and the amino
acid sequence (SEQ ID NO:42) of COXVIb (coxVIb gene, last exon and flanking sequence, 
NM_001863). 

Figures 25A and 25B show the nucleic acid sequence (SEQ ID NO:43) and the amino
acid sequence (SEQ ID NO:44) of TCF4 (NM_030756).

Figures 26A-26B show the nucleic acid sequence (SEQ ID NO:45) and Figure 26C 
shows the amino acid sequence (SEQ ID NO:46) of CAD17 (liver-intestine cadherin,
NM_004063).

Figures 27A and 27B show the nucleic acid sequence (SEQ ID NO:47) and the amino 
acid sequence (SEQ ID NO:48) of CLDN15 (claudin 15, NM_014343).

Figures 28A-28B show the nucleic acid sequence (SEQ ID NO:49) and Figure 28C
shows the amino acid sequence (SEQ ID NO:50) of CFTR (chloride channel, NM_000492).

Figures 29A and 29B show the nucleic acid sequence (SEQ ID NO:51) and the amino 
acid sequence (SEQ ID NO:52) of H2R (histamine H2 receptor, NM_022304). 

Figures 30A-30B show the nucleic acid sequence (SEQ ID NO:53) and Figure 30C 
shows the amino acid sequence (SEQ ID NO:54) of EGFR (epidermal growth factor receptor, 
NM_005228).

Figures 31A-31B show the nucleic acid sequence (SEQ ID NO:55) and Figure 31C
shows the amino acid sequence (SEQ ID NO:56) of EPHB2 (NM_004442).

Figures 32A and 32B show the nucleic acid sequence (SEQ ID NO:57) and the amino 
acid sequence (SEQ ID NO:58) of CRIPTO CR-1 (NM_003212).

Figures 33A and 33B show the nucleic acid sequence (SEQ ID NO:59) and the amino 
acid sequence (SEQ ID NO:60) of Ephrin B1 (NM_004429).

Figures 34A and 34B show the nucleic acid sequence (SEQ ID NO:61) and the amino
acid sequence (SEQ ID NO:62) of MMP-17/MT4-MMP (matrix metalloproteinase 17,
NM_016155).

Figures 35A and 35B show the nucleic acid sequence (SEQ ID NO:63) and the
amino acid sequence (SEQ ID NO:64) of MMP26 (matrix metalloproteinase 26,
NM_021801). 

Figures 36A and 36B show the nucleic acid sequence (SEQ ID NO:65) and the amino
acid sequence (SEQ ID NO:66) of ADAM10 (NM_001110).

Figures 37A and 37B show the nucleic acid sequence (SEQ ID NO:67) and the amino 
acid sequence (SEQ ID NO:68) of ADAM1 (XM_132370).

Figures 38A and 38B show the nucleic acid sequence (SEQ ID NO:69) and the amino 
acid sequence (SEQ ID NO:70) of TIM1 (NM_003254).

Figures 39A and 39B show the nucleic acid sequence (SEQ ID NO:71) and the amino
acid sequence (SEQ ID NO:72) of MUC1 (XM_053256). 

Figures 40A and 40B show the nucleic acid sequence (SEQ ID NO:73) and the amino
acid sequence (SEQ ID NO:74) of CEA (NM_004363).

Figures 41A and 41B show the nucleic acid sequence (SEQ ID NO:75) and the amino 
acid sequence (SEQ ID NO:76) of NCA (NM_002483).

Figures 42A and 42B show the nucleic acid sequence (SEQ ID NO:77) and the amino
acid sequence (SEQ ID NO:78) of Follistatin (NM_006350).

Figures 43A and 43B show the nucleic acid sequence (SEQ ID NO:79) and the amino
acid sequence (SEQ ID NO:80) of Claudin 1 (NM_021101).

Figures 44A and 44B show the nucleic acid sequence (SEQ ID NO:81) and the amino 
acid sequence (SEQ ID NO:82) of Claudin 14 (NM_012130). 

Figures 45A-45B show the nucleic acid sequence (SEQ ID NO:83) and Figure 45C 
shows the amino acid sequence (SEQ ID NO:84) of Tenascin-R (NM_003285).

Figures 46A and 46B show the nucleic acid sequence (SEQ ID NO:85) and the amino
acid sequence (SEQ ID NO:86) of CAD3 (NM_001793).

Figures 47A and 47B show the nucleic acid sequence (SEQ ID NO:87) and the amino
acid sequence (SEQ ID NO:88) of CONT (NM_001843).

Figures 48A and 48B show the nucleic acid sequence (SEQ ID NO:89) and the amino
acid sequence (SEQ ID NO:90) of Osteopontin (NM_000582).

Figures 49A and 49B show the nucleic acid sequence (SEQ ID NO:91) and the amino
acid sequence (SEQ ID NO:92) of Galectin 8 (NM_006499). 

Figures 50A and 50B show the nucleic acid sequence (SEQ ID NO:93) and the amino 
acid sequence (SEQ ID NO:94) of GS1 (biglycan, NM_001711).

Figures 51A and 51B show the nucleic acid sequence (SEQ ID NO:95) and the amino
acid sequence (SEQ ID NO:96) of Frizzled 2 (NM_001466).

Figures 52A and 52B show the nucleic acid sequence (SEQ ID NO:97) and the amino 
acid sequence (SEQ ID NO:98) of ISLR (NM_005545).

Figures 53A-53B show the nucleic acid sequence (SEQ ID NO:) and Figure 53C
shows the amino acid sequence (SEQ ID NO:2) of

Figures 54A and 54B show the nucleic acid sequence (SEQ ID NO:l) and the amino
acid sequence (SEQ ID NO:2) of

Figures 55A and 55B show the nucleic acid sequence (SEQ ID NO:103) and the amino
acid sequence (SEQ ID NO:104) of Tie2 ligand2 (NM_001147).

Figures 56A and 56B show the nucleic acid sequence (SEQ ID NO:105) and the amino
acid sequence (SEQ ID NO:106) of VEGFC (NM_005429).

Figures 57A and 57B show the nucleic acid sequence (SEQ ID NO: 107) and the amino 
acid sequence (SEQ ID NO: 108) of tPA (NM_000930). 

Figures 58A-58B show the nucleic acid sequence (SEQ ID NO: 109) and Figure 58C 
shows the amino acid sequence (SEQ ID NO:110) of thrombomodulin (NM_000361).

Figures 59A and 59B show the nucleic acid sequence (SEQ ID NO:111) and the amino
acid sequence (SEQ ID NO:112) of TF (coagulation factor III, thromboplastin, tissue factor,
NM_001993).

Figures 60A and 60B show the nucleic acid sequence (SEQ ID NO: 113) and the amino 
acid sequence (SEQ ID NO: 114) of GPR4 (G-coupled protein receptor-4, NM_005282). 

Figures 61A and 61B show the nucleic acid sequence (SEQ ID NO:115) and the amino
acid sequence (SEQ ID NO:116) of GPR66 (G-coupled protein receptor 66).

Figures 62A and 62B show the nucleic acid sequence (SEQ ID NO:117) and the amino
acid sequence (SEQ ID NO:118) of SLC22A2 (NM_003058).

Figures 63A-63B show the nucleic acid sequence (SEQ ID NO: 119) and Figure 63C 
shows the amino acid sequence (SEQ ID NO:120) of MLSN1 (NM_002420).

Figures 64A-64B show the nucleic acid sequence (SEQ ID NO:121) and Figure 64C
shows the amino acid sequence (SEQ ID NO:122) of ATN2 (Na/K transport, NM_000702).

DESCRIPTION OF THE INVENTION 

Barrett's esophagus, a complication of gastrointestinal reflux disease, is the primary 
risk factor for esophageal adenocarcinoma. Biopsy specimens representing disease progression 
through Barrett's esophagus, dysplasia and adenocarcinoma, were collected and analyzed 
using cDNA microarrays to identify genes expressed in the different disease stages. It was 
discovered that the expression of particular genes increased with the progression of the disease
through dysplasia, especially high grade dysplasia, suggestive of a differentiated small 
intestinal enterocyte lineage. The present invention defines a collection of markers that assist 
in identifying patients with highest risk of developing cancer, especially the development of 
esophageal adenocarcinoma. 

The progression of Barrett's esophagus through dysplasia to adenocarcinoma was 
examined, identifying specific genes associated with increasing risk of carcinogenesis. These 
data provide insight into the potential role of progressive intestinal metaplasia in generating 
the colon tumor-like expression profiles disclosed herein for esophageal adenocarcinoma. 
Genes that define early stages of this process, progression of BE to dysplasia, serve as markers
to permit targeting of surveillance to those patients at most risk of developing esophageal 
carcinoma. 

DNA microarray technology has been used to characterize and cluster Barrett's 
metaplasia from normal mucosa, and esophageal adenocarcinoma and squamous cell
carcinoma (Barrett et al., Neoplasia 4:121-128 (2002); and Selaru et al., Oncogene 21:475-478 
(2002)). The authors do not, however, describe HGD markers or dysplasia markers of any 
kind useful for predicting patients likely to develop adenocarcinoma. 

The present invention provides nucleic acid and protein sequences that are
differentially expressed in high-grade esophageal dysplasia when compared to normal tissue
controls, herein termed "high-grade dysplasia genes," "high-grade dysplasia nucleic acid
sequences," "HGD marker genes" and the like. As outlined below, high-grade esophageal
dysplasia sequences that are differentially expressed include those that are up-regulated in
high-grade esophageal dysplasia. The differential expression of these sequences in high-grade
esophageal dysplasia, combined with the fact that they have been identified in patients likely to
develop cancer, such as adenocarcinoma, indicates that they are contributory factors in cancer. The
high-grade esophageal dysplasia nucleic acid sequences, or the polypeptides encoded by the nucleic
acids, of the invention are disclosed in Table 4 as HGD marker genes, or polypeptides, as
follows: ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1 or 2); AGR2 (anterior gradient 2
(Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3 or 4); ADAM8 (NM_001109) (SEQ
ID NO:5 or 6); PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:7 or
8); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9 or 10); NROB2 (Nuclear
hormone receptor, NM_021969) (SEQ ID NO:11 or 12); TM7SF1 (NM_003272) (SEQ ID
NO:13 or 14); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15 or
16); MAT2B (methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17 or 18);
STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19 or 20); PPBI (alkaline phosphatase,
intestinal precursor, NM_001631) (SEQ ID NO:21 or 22); SLNAC1 (sodium channel receptor
SLNAC1, NM_004769) (SEQ ID NO:23 or 24); CAH4 (carbonic anhydrase iv precursor,
NM_000717) (SEQ ID NO:25 or 26); PA21 (phospholipase a2 precursor, NM_000928) (SEQ
ID NO:27 or 28); PAR2 (proteinase activated receptor 2 precursor, NM_005242) (SEQ ID
NO:29 or 30); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID NO:31 or 32); MYO1A
(myosin-1A, NM_005379) (SEQ ID NO:33 or 34); CYP2J2 (cytochrome P450
monooxygenase, NM_000775) (SEQ ID NO:35 or 36); PHYH (phytanoyl-CoA-hydroxylase
(Refsum disease), NM_006214) (SEQ ID NO:37 or 38); CYB5 (cytochrome b5, 3' end,
NM_001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene, last exon and flanking
sequence, NM_001863) (SEQ ID NO:41 or 42); and TCF4 (NM_030756) (SEQ ID NO:43 or
44).

Definitions 

The phrases "gene amplification" and "gene duplication" are used interchangeably and 
refer to a process by which multiple copies of a gene or gene fragment are formed in a 
particular cell or cell line. The duplicated region (a stretch of amplified DNA) is often 
referred to as an "amplicon." Usually, the amount of the messenger RNA (mRNA) produced, i.e.,
the level of gene expression, also increases in proportion to the number of copies made of
the particular gene expressed.

"Tumor", as used herein, refers to all neoplastic cell growth and proliferation, whether
malignant or benign, and all pre-cancerous and cancerous cells and tissues.

The terms "cancer" and "cancerous" refer to or describe the physiological condition in 
mammals that is typically characterized by unregulated cell growth. Examples of cancer
include, but are not limited to, carcinoma, adenocarcinoma, lymphoma, blastoma, sarcoma, and
leukemia. More particular examples of such cancers include esophageal cancer, breast cancer,
prostate cancer, colon cancer, squamous cell cancer, small-cell lung cancer, non-small cell
lung cancer, gastrointestinal cancer, pancreatic cancer, glioblastoma, cervical cancer, ovarian
cancer, liver cancer, bladder cancer, hepatoma, colorectal cancer, endometrial carcinoma,
salivary gland carcinoma, kidney cancer, vulval cancer, thyroid cancer, hepatic
carcinoma and various types of head and neck cancer. 

The term "diagnosis" or "diagnosing" as used herein shall refer to the determination of 
the nature of a case of a disease, such as by determining a gene expression profile or
polypeptide expression profile unique to the disease or a stage of the disease. 

A "normal" tissue sample refers to tissue or cells that are not diseased as defined 
herein, such as tissue from a mammal that is not experiencing a particular disease of interest. 

The term "normal cell" or "normal tissue" as used herein refers to a state of a cell or tissue in
which the cell or tissue is apparently free of an adverse biological condition when compared to
a diseased cell or tissue having that adverse biological condition. The normal cell or normal
tissue may be from any prokaryotic or eukaryotic organism including, but not limited to,
bacteria, yeast, insect, bird, reptile, and any mammal including human. Where the normal
tissue or cell is used as a normal control sample, it is generally from the same species as the
test sample. Where the cell or tissue is mammalian, the cell or tissue is any cell or tissue 
including, but not limited to blood, muscle, nerve, brain, breast, heart, lung, liver, pancreas, 
spleen, thymus, esophagus, stomach, intestine, kidney, testis, ovary, uterus, hair follicle, skin, 
bone, bladder, and spinal cord. 

"Treatment" is an intervention performed with the intention of preventing the 
development or altering the pathology of a disorder. Accordingly, "treatment" refers to both 
therapeutic treatment and prophylactic or preventative measures. Those in need of treatment 
include those already with the disorder as well as those in which the disorder is to be 
prevented. In tumor (e.g., cancer) treatment, a therapeutic agent may directly decrease the 
pathology of tumor cells, or render the tumor cells more susceptible to treatment by other 
therapeutic agents, e.g., radiation and/or chemotherapy. 

A "pharmaceutical composition" as used herein refers to a composition comprising a 
chemotherapeutic agent for treatment of a disease combined with physiologically acceptable 
materials such as carriers, excipients, stabilizers, buffers, salts, antioxidants, hydrophilic 
polymers, amino acids, carbohydrates, ionic or nonionic surfactants, and/or polyethylene or 
propylene glycol. The pharmaceutical composition may be in aqueous form, tablet, capsule, 
microcapsules, liposomes, transdermal patches, and the like. 

The "pathology" of cancer includes all phenomena that compromise the well-being of 
the patient. This includes, without limitation, abnormal or uncontrollable cell growth, 
metastasis, interference with the normal functioning of neighboring cells, release of cytokines 
or other secretory products at abnormal levels, suppression or aggravation of inflammatory or 
immunological response, etc. 

"Mammal" for purposes of treatment refers to any animal classified as a mammal, 
including humans, domestic and farm animals, and zoo, sports, or pet animals, such as dogs, 
horses, cats, cattle, pigs, sheep, etc. Preferably, the mammal is human. 

"Carriers" as used herein include pharmaceutically acceptable carriers, excipients, or 
stabilizers which are nontoxic to the cell or mammal being exposed thereto at the dosages and 
concentrations employed. Often the physiologically acceptable carrier is an aqueous pH 
buffered solution. Examples of physiologically acceptable carriers include buffers such as 
phosphate, citrate, and other organic acids; antioxidants including ascorbic acid; low 
molecular weight (less than about 10 residues) polypeptides; proteins, such as serum albumin, 
gelatin, or immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids 
such as glycine, glutamine, asparagine, arginine or lysine; monosaccharides, disaccharides, 
and other carbohydrates including glucose, mannose, or dextrins; chelating agents such as 
EDTA; sugar alcohols such as mannitol or sorbitol; salt-forming counterions such as sodium; 
and/or nonionic surfactants such as TWEEN™, polyethylene glycol (PEG), and 
PLURONICS™. 

Administration "in combination with" one or more further therapeutic agents includes 
simultaneous (concurrent) and consecutive administration in any order. 

The term "cytotoxic agent" as used herein refers to a substance that inhibits or prevents 
the function of cells and/or causes destruction of cells. The term is intended to include 
radioactive isotopes (e.g., I¹³¹, I¹²⁵, Y⁹⁰ and Re¹⁸⁶), chemotherapeutic agents, and toxins such 
as enzymatically active toxins of bacterial, fungal, plant or animal origin, or fragments thereof. 

A "chemotherapeutic agent" is a chemical compound useful in the treatment of cancer. 

Examples of chemotherapeutic agents include adriamycin, doxorubicin, epirubicin, 
5-fluorouracil, cytosine arabinoside ("Ara-C"), cyclophosphamide, thiotepa, busulfan, cytoxin, 
taxoids, e.g., paclitaxel (Taxol, Bristol-Myers Squibb Oncology, Princeton, NJ), and docetaxel 
(Taxotere, Rhone-Poulenc Rorer, Antony, France), toxotere, methotrexate, cisplatin, 
melphalan, vinblastine, bleomycin, etoposide, ifosfamide, mitomycin C, mitoxantrone, 
vincristine, vinorelbine, carboplatin, teniposide, daunomycin, carminomycin, aminopterin, 
dactinomycin, mitomycins, esperamicins (see U.S. Pat. No. 4,675,187), 5-FU, 6-thioguanine, 
6-mercaptopurine, actinomycin D, VP-16, chlorambucil, melphalan, and other related nitrogen 
mustards. Also included in this definition are hormonal agents that act to regulate or inhibit 
hormone action on tumors such as tamoxifen and onapristone. In an embodiment, the 
chemotherapeutic agent of the invention is a chemical compound useful in the treatment of 
HGD, adenocarcinoma, or for inhibiting or preventing progression from the HGD to 
adenocarcinoma in a patient. 

A "growth inhibitory agent" when used herein refers to a compound or composition 
which inhibits growth of a cell, especially a cancer cell overexpressing any of the genes 
identified herein, either in vitro or in vivo. Thus, the growth inhibitory agent is one which 
significantly reduces the percentage of cells overexpressing such genes in S phase. Examples 
of growth inhibitory agents include agents that block cell cycle progression (at a place other 
than S phase), such as agents that induce G1 arrest and M-phase arrest. Classical M-phase 
blockers include the vincas (vincristine and vinblastine), taxol, and topo II inhibitors such as 
doxorubicin, epirubicin, daunorubicin, etoposide, and bleomycin. Those agents that arrest G1 
also spill over into S-phase arrest, for example, DNA alkylating agents such as tamoxifen, 
prednisone, dacarbazine, mechlorethamine, cisplatin, methotrexate, 5-fluorouracil, and ara-C. 
Further information can be found in The Molecular Basis of Cancer, Mendelsohn and Israel, 
eds., Chapter 1, entitled "Cell cycle regulation, oncogenes, and antineoplastic drugs" by 
Murakami et al., (WB Saunders: Philadelphia, 1995), especially p. 13. 

"Doxorubicin" is an anthracycline antibiotic. The full chemical name of doxorubicin is 
(8S-cis)-10-[(3-amino-2,3,6-trideoxy-α-L-lyxo-hexopyranosyl)oxy]-7,8,9,10-tetrahydro-
6,8,11-trihydroxy-8-(hydroxyacetyl)-1-methoxy-5,12-naphthacenedione. 

The term "cytokine" is a generic term for proteins released by one cell population 
which act on another cell as intercellular mediators. Examples of such cytokines are 
lymphokines, monokines, and traditional polypeptide hormones. Included among the 
cytokines are growth hormone such as human growth hormone, N-methionyl human growth 
hormone, and bovine growth hormone; parathyroid hormone; thyroxine; insulin; proinsulin; 
relaxin; prorelaxin; glycoprotein hormones such as follicle stimulating hormone (FSH), 
thyroid stimulating hormone (TSH), and luteinizing hormone (LH); hepatic growth factor; 
fibroblast growth factor; prolactin; placental lactogen; tumor necrosis factor-α and -β; 
mullerian-inhibiting substance; mouse gonadotropin-associated peptide; inhibin; activin; 
vascular endothelial growth factor; integrin; thrombopoietin (TPO); nerve growth factors such 
as NGF-β; platelet-growth factor; transforming growth factors (TGFs) such as TGF-α and 
TGF-β; insulin-like growth factor-I and -II; erythropoietin (EPO); osteoinductive factors; 
interferons such as interferon-α, -β, and -γ; colony stimulating factors (CSFs) such as 
macrophage-CSF (M-CSF); granulocyte-macrophage-CSF (GM-CSF); and granulocyte-CSF 
(G-CSF); interleukins (ILs) such as IL-1, IL-1α, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, 
IL-11, IL-12; a tumor necrosis factor such as TNF-α or TNF-β; and other polypeptide factors 
including LIF and kit ligand (KL). As used herein, the term cytokine includes proteins from 
natural sources or from recombinant cell culture and biologically active equivalents of the 
native sequence cytokines. 

The term "prodrug" as used in this application refers to a precursor or derivative form 
of a pharmaceutically active substance that is less cytotoxic to tumor cells compared to the 
parent drug and is capable of being enzymatically activated or converted into the more active 
parent form. See, e.g., Wilman, "Prodrugs in Cancer Chemotherapy", Biochemical Society 
Transactions, 14:375-382, 615th Meeting, Belfast (1986), and Stella et al., "Prodrugs: A 
Chemical Approach to Targeted Drug Delivery", Directed Drug Delivery, Borchardt et al., 
(ed.), pp. 147-267, Humana Press (1985). The prodrugs of this invention include, but are not 
limited to, phosphate-containing prodrugs, thiophosphate-containing prodrugs, 
sulfate-containing prodrugs, peptide-containing prodrugs, D-amino acid-modified prodrugs, 
glycosylated prodrugs, β-lactam-containing prodrugs, optionally substituted 
phenoxyacetamide-containing prodrugs or optionally substituted phenylacetamide-containing 
prodrugs, 5-fluorocytosine and other 5-fluorouridine prodrugs which can be converted into the 
more active cytotoxic free drug. Examples of cytotoxic drugs that can be derivatized into a 
prodrug form for use in this invention include, but are not limited to, those chemotherapeutic 
agents described above. 

An "effective amount" or "therapeutically effective amount" of a polypeptide disclosed 
herein or an antagonist thereof, in reference to inhibition of neoplastic cell growth, tumor 
growth or cancer cell growth, is an amount capable of inhibiting, to some extent, the growth of 
target cells. The term includes an amount capable of invoking a growth inhibitory, cytostatic 
and/or cytotoxic effect and/or apoptosis of the target cells. An "effective amount" of an 
antagonist of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1 or 2); AGR2 
(anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3 or 4); ADAM8 
(NM_001109) (SEQ ID NO:5 or 6); PRSS8 (Prostasin precursor, serine protease, 
NM_002773) (SEQ ID NO:7 or 8); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9 
or 10); NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:11 or 12); TM7SF1 
(NM_003272) (SEQ ID NO:13 or 14); DLDH (dihydrolipamide dehydrogenase, NM_000108) 
(SEQ ID NO:15 or 16); MAT2B (methionine adenosyltransferase II, beta, NM_013283) 
(SEQ ID NO:17 or 18); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19 or 20); PPBI 
(alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21 or 22); SLNAC1 
(sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23 or 24); CAH4 (carbonic 
anhydrase iv precursor, NM_000717) (SEQ ID NO:25 or 26); PA21 (phospholipase a2 
precursor, NM_000928) (SEQ ID NO:27 or 28); PAR2 (proteinase activated receptor 2 
precursor, NM_005242) (SEQ ID NO:29 or 30); IDE (insulin-degrading enzyme, 
NM_004969) (SEQ ID NO:31 or 32); MYO1A (myosin-1A, NM_005379) (SEQ ID NO:33 or 
34); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35 or 36); 
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37 or 38); 
CYB5 (cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene, 
last exon and flanking sequence, NM_001863) (SEQ ID NO:41 or 42); and TCF4 
(NM_030756) (SEQ ID NO:43 or 44) gene or polypeptide, for purposes of inhibiting 
neoplastic cell growth, tumor growth or cancer cell growth, may be determined empirically 
and in a routine manner. The terms further refer to an amount capable of invoking one or 
more of the following effects: (1) inhibition, to some extent, of tumor growth, including, 
slowing down and complete growth arrest; (2) reduction in the number of tumor cells; (3) 
reduction in tumor size; (4) inhibition (i.e., reduction, slowing down or complete stopping) of 
tumor cell infiltration into peripheral organs; (5) inhibition (i.e., reduction, slowing down or 
complete stopping) of metastasis; (6) enhancement of anti-tumor immune response, which 
may, but does not have to, result in the regression or rejection of the tumor; and/or (7) relief, to 
some extent, of one or more symptoms associated with the disorder. A "therapeutically 
effective amount" of an antagonist of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1 or 2); 
AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3 or 4); 
ADAM8 (NM_001109) (SEQ ID NO:5 or 6); PRSS8 (Prostasin precursor, serine protease, 
NM_002773) (SEQ ID NO:7 or 8); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9 
or 10); NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:11 or 12); TM7SF1 
(NM_003272) (SEQ ID NO:13 or 14); DLDH (dihydrolipamide dehydrogenase, NM_000108) 
(SEQ ID NO:15 or 16); MAT2B (methionine adenosyltransferase II, beta, NM_013283) 
(SEQ ID NO:17 or 18); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19 or 20); PPBI 
(alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21 or 22); SLNAC1 
(sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23 or 24); CAH4 (carbonic 
anhydrase iv precursor, NM_000717) (SEQ ID NO:25 or 26); PA21 (phospholipase a2 
precursor, NM_000928) (SEQ ID NO:27 or 28); PAR2 (proteinase activated receptor 2 
precursor, NM_005242) (SEQ ID NO:29 or 30); IDE (insulin-degrading enzyme, 
NM_004969) (SEQ ID NO:31 or 32); MYO1A (myosin-1A, NM_005379) (SEQ ID NO:33 or 
34); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35 or 36); 
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37 or 38); 
CYB5 (cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene, 
last exon and flanking sequence, NM_001863) (SEQ ID NO:41 or 42); or TCF4 
(NM_030756) (SEQ ID NO:43 or 44) gene or polypeptide for purposes of treatment of tumor 
may be determined empirically and in a routine manner. 

A "growth inhibitory amount" of a compound that inhibits growth of a cell expressing 
genes, or polypeptides, from the following group: ET-1 (endothelin-1, NM_001955) (SEQ ID 
NO:1 or 2); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID 
NO:3 or 4); ADAM8 (NM_001109) (SEQ ID NO:5 or 6); PRSS8 (Prostasin precursor, serine 
protease, NM_002773) (SEQ ID NO:7 or 8); AXO1 (Axonin-1 precursor, NM_005076) (SEQ 
ID NO:9 or 10); NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:11 or 12); 
TM7SF1 (NM_003272) (SEQ ID NO:13 or 14); DLDH (dihydrolipamide dehydrogenase, 
NM_000108) (SEQ ID NO:15 or 16); MAT2B (methionine adenosyltransferase II, beta, 
NM_013283) (SEQ ID NO:17 or 18); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19 
or 20); PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21 or 22); 
SLNAC1 (sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23 or 24); CAH4 
(carbonic anhydrase iv precursor, NM_000717) (SEQ ID NO:25 or 26); PA21 (phospholipase 
a2 precursor, NM_000928) (SEQ ID NO:27 or 28); PAR2 (proteinase activated receptor 2 
precursor, NM_005242) (SEQ ID NO:29 or 30); IDE (insulin-degrading enzyme, 
NM_004969) (SEQ ID NO:31 or 32); MYO1A (myosin-1A, NM_005379) (SEQ ID NO:33 or 
34); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35 or 36); 
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37 or 38); 
CYB5 (cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene, 
last exon and flanking sequence, NM_001863) (SEQ ID NO:41 or 42); and TCF4 
(NM_030756) (SEQ ID NO:43 or 44) is an amount of the compound capable of inhibiting the 
growth of a cell, especially tumor, e.g., cancer cell, either in vitro or in vivo. Optionally, the 
compound is an antagonist of the gene or polypeptide, such as an antagonist antibody or 
antagonist small organic molecule. A "growth inhibitory amount" of such a compound, for 
purposes of inhibiting neoplastic cell growth, may be determined empirically and in a routine 
manner. 



A "cytotoxic amount" of an ET-1 (endothelin-1, NM_001955) (SEQ ID NO:2); AGR2 
(anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:4); ADAM8 
(NM_001109) (SEQ ID NO:6); PRSS8 (Prostasin precursor, serine protease, NM_002773) 
(SEQ ID NO:8); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:10); NROB2 
(Nuclear hormone receptor, NM_021969) (SEQ ID NO:12); TM7SF1 (NM_003272) (SEQ ID 
NO:14); DLDH (dihydrolipamide dehydrogenase, NM_000108) (SEQ ID NO:16); MAT2B 
(methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:18); STC-2 
(stanniocalcin-2, NM_003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal 
precursor, NM_001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1, 
NM_004769) (SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ 
ID NO:26); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:28); PAR2 
(proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:30); IDE 
(insulin-degrading enzyme, NM_004969) (SEQ ID NO:32); MYO1A (myosin-1A, NM_005379) (SEQ 
ID NO:34); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:36); 
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:38); CYB5 
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon 
and flanking sequence, NM_001863) (SEQ ID NO:42); or TCF4 (NM_030756) (SEQ ID 
NO:44) polypeptide antagonist is an amount capable of causing the destruction of a cell, 
especially tumor, e.g., cancer cell, either in vitro or in vivo. A "cytotoxic amount" of such a 
polypeptide antagonist for purposes of inhibiting neoplastic cell growth may be determined 
empirically and in a routine manner. 

The terms ET-1 (endothelin-1, NM_001955) (SEQ ID NO:2); AGR2 (anterior gradient 
2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:4); ADAM8 (NM_001109) (SEQ ID 
NO:6); PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:8); AXO1 
(Axonin-1 precursor, NM_005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, 
NM_021969) (SEQ ID NO:12); TM7SF1 (NM_003272) (SEQ ID NO:14); DLDH 
(dihydrolipamide dehydrogenase, NM_000108) (SEQ ID NO:16); MAT2B (methionine 
adenosyltransferase II, beta, NM_013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, 
NM_003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal precursor, 
NM_001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1, NM_004769) 
(SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ ID NO:26); 
PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:28); PAR2 (proteinase activated 
receptor 2 precursor, NM_005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, 
NM_004969) (SEQ ID NO:32); MYO1A (myosin-1A, NM_005379) (SEQ ID NO:34); 
CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:36); PHYH 
(phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:38); CYB5 
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon 
and flanking sequence, NM_001863) (SEQ ID NO:42); and TCF4 (NM_030756) (SEQ ID 
NO:44) polypeptide or protein when used herein encompass native sequence ET-1 
(endothelin-1, NM_001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenopus laevis) 
homolog, NM_006408) (SEQ ID NO:4); ADAM8 (NM_001109) (SEQ ID NO:6); PRSS8 
(Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:8); AXO1 (Axonin-1 
precursor, NM_005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM_021969) 
(SEQ ID NO:12); TM7SF1 (NM_003272) (SEQ ID NO:14); DLDH (dihydrolipamide 
dehydrogenase, NM_000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II, 
beta, NM_013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:20); 
PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:22); SLNAC1 
(sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:24); CAH4 (carbonic 
anhydrase iv precursor, NM_000717) (SEQ ID NO:26); PA21 (phospholipase a2 precursor, 
NM_000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor, 
NM_005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID 
NO:32); MYO1A (myosin-1A, NM_005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450 
monooxygenase, NM_000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase 
(Refsum disease), NM_006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end, 
NM_001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence, 
NM_001863) (SEQ ID NO:42); and TCF4 (NM_030756) (SEQ ID NO:44) polypeptide 
variants (which are further defined herein). The ET-1 (endothelin-1, NM_001955) (SEQ ID 
NO:2); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:4); 
ADAM8 (NM_001109) (SEQ ID NO:6); PRSS8 (Prostasin precursor, serine protease, 
NM_002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:10); 
NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:12); TM7SF1 (NM_003272) 
(SEQ ID NO:14); DLDH (dihydrolipamide dehydrogenase, NM_000108) (SEQ ID NO:16); 
MAT2B (methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:18); STC-2 
(stanniocalcin-2, NM_003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal 
precursor, NM_001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1, 
NM_004769) (SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ 
ID NO:26); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:28); PAR2 
(proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:30); IDE 
(insulin-degrading enzyme, NM_004969) (SEQ ID NO:32); MYO1A (myosin-1A, NM_005379) (SEQ 
ID NO:34); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:36); 
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:38); CYB5 
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon 
and flanking sequence, NM_001863) (SEQ ID NO:42); or TCF4 (NM_030756) (SEQ ID 
NO:44) polypeptide may be isolated from a variety of sources, such as from human tissue 
types or from another source, or prepared by recombinant and/or synthetic methods. 

A "native sequence polypeptide" of each HGD marker polypeptide has the same amino 
acid sequence or is a polypeptide variant having at least about 80% amino acid sequence 
identity, preferably at least about 81% amino acid sequence identity, more preferably at least 
about 82% amino acid sequence identity, more preferably at least about 83% amino acid 
sequence identity, more preferably at least about 84% amino acid sequence identity, more 
preferably at least about 85% amino acid sequence identity, more preferably at least about 
86% amino acid sequence identity, more preferably at least about 87% amino acid sequence 
identity, more preferably at least about 88% amino acid sequence identity, more preferably at 
least about 89% amino acid sequence identity, more preferably at least about 90% amino acid 
sequence identity, more preferably at least about 91% amino acid sequence identity, more 
preferably at least about 92% amino acid sequence identity, more preferably at least about 
93% amino acid sequence identity, more preferably at least about 94% amino acid sequence 
identity, more preferably at least about 95% amino acid sequence identity, more preferably at 
least about 96% amino acid sequence identity, more preferably at least about 97% amino acid 
sequence identity, more preferably at least about 98% amino acid sequence identity and most 
preferably at least about 99% amino acid sequence identity with a full-length native sequence 
polypeptide sequence, lacking the signal peptide as disclosed herein, as the ET-1 
(endothelin-1, NM_001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, 
NM_006408) (SEQ ID NO:4); ADAM8 (NM_001109) (SEQ ID NO:6); PRSS8 (Prostasin 
precursor, serine protease, NM_002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor, 
NM_005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID 
NO:12); TM7SF1 (NM_003272) (SEQ ID NO:14); DLDH (dihydrolipamide dehydrogenase, 
NM_000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II, beta, 
NM_013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:20); 
PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:22); SLNAC1 
(sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:24); CAH4 (carbonic 
anhydrase iv precursor, NM_000717) (SEQ ID NO:26); PA21 (phospholipase a2 precursor, 
NM_000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor, 
NM_005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID 
NO:32); MYO1A (myosin-1A, NM_005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450 
monooxygenase, NM_000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase 
(Refsum disease), NM_006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end, 
NM_001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence, 
NM_001863) (SEQ ID NO:42); or TCF4 (NM_030756) (SEQ ID NO:44) polypeptide as 
derived from nature. Such native sequence polypeptide can be isolated from nature or can be 
produced by recombinant and/or synthetic means. The term "native sequence polypeptide" 
specifically encompasses naturally-occurring truncated or secreted forms (e.g., an extracellular 
domain sequence), naturally-occurring variant forms (e.g., alternatively spliced forms) and 
naturally-occurring allelic variants of the polypeptides encoded by a HGD marker gene as 
disclosed herein. In one embodiment of the invention, the native sequence HGD marker 
polypeptide is a mature or full-length native sequence HGD marker polypeptide as encoded by 
the nucleic acid sequences of the GenBank accession numbers listed in Table 4A for the 
respective polypeptide. Also, while the HGD marker polypeptides encoded by the nucleic acid 
sequences disclosed in the respective GenBank accession numbers listed in Table 4A are 
shown to begin with the methionine residue designated therein as amino acid position 1, it is 
conceivable and possible that another methionine residue located either upstream or 
downstream from amino acid position 1 may be employed as the starting amino acid residue 
for the HGD marker polypeptide. 

The "extracellular domain" or "ECD" of a polypeptide disclosed herein refers to a 
form of the polypeptide which is essentially free of the transmembrane and cytoplasmic 
domains. Ordinarily, a polypeptide ECD will have less than about 1% of such transmembrane 
and/or cytoplasmic domains and preferably, will have less than about 0.5% of such domains. 
It will be understood that any transmembrane domain(s) identified for the polypeptides of the 
present invention are identified pursuant to criteria routinely employed in the art for 
identifying that type of hydrophobic domain. The exact boundaries of a transmembrane 
domain may vary but most likely by no more than about 5 amino acids at either end of the 
domain as initially identified and as shown in the appended figures. As such, in one 
embodiment of the present invention, the extracellular domain of a polypeptide of the present 
invention comprises amino acids 1 to X of the mature amino acid sequence, wherein X is any 
amino acid within 5 amino acids on either side of the extracellular domain/transmembrane 
domain boundary. 
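
Purely as an illustration of the boundary tolerance described above, and not as part of the disclosed methods, the following Python sketch enumerates candidate ECD sequences spanning amino acids 1 to X of a mature sequence, for every X within 5 residues on either side of an extracellular domain/transmembrane domain boundary. The function name, the `boundary` argument and the example sequence are hypothetical and are assumed only for this sketch.

```python
def candidate_ecd_sequences(mature_seq, boundary, window=5):
    """Map each end position X (1-based) to residues 1..X of the mature sequence.

    boundary: assumed 1-based position of the last extracellular residue.
    window:   tolerance, in residues, on either side of that boundary.
    """
    if not 1 <= boundary <= len(mature_seq):
        raise ValueError("boundary must lie within the mature sequence")
    first_x = max(1, boundary - window)
    last_x = min(len(mature_seq), boundary + window)
    # Slices are 0-based and end-exclusive, so mature_seq[:x] is residues 1..x.
    return {x: mature_seq[:x] for x in range(first_x, last_x + 1)}


# Hypothetical 20-residue sequence with an assumed boundary after residue 10.
candidates = candidate_ecd_sequences("ACDEFGHIKLMNPQRSTVWY", boundary=10)
print(sorted(candidates))  # candidate end positions X
```

The window is clipped at the ends of the sequence, so a boundary close to the N-terminus simply yields fewer candidates.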

The approximate location of the "signal peptides" of the various PRO polypeptides 
disclosed herein is shown in the accompanying figures. It is noted, however, that the 
C-terminal boundary of a signal peptide may vary, but most likely by no more than about 5 
amino acids on either side of the signal peptide C-terminal boundary as initially identified 
herein, wherein the C-terminal boundary of the signal peptide may be identified pursuant to 
criteria routinely employed in the art for identifying that type of amino acid sequence element 
(e.g., Nielsen et al., Prot. Eng., 10:1-6 (1997) and von Heijne et al., Nucl. Acids Res., 
14:4683-4690 (1986)). Moreover, it is also recognized that, in some cases, cleavage of a 
signal sequence from a secreted polypeptide is not entirely uniform, resulting in more than one 
secreted species. These mature polypeptides, where the signal peptide is cleaved within no 
more than about 5 amino acids on either side of the C-terminal boundary of the signal peptide 
as identified herein, and the polynucleotides encoding them, are contemplated by the present 
invention. 

A "polypeptide variant" of any one of ET-1 (endothelin-1, NM_001955) (SEQ ID 
NO:2); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:4); 
ADAM8 (NM_001109) (SEQ ID NO:6); PRSS8 (Prostasin precursor, serine protease, 
NM_002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:10); 
NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:12); TM7SF1 (NM_003272) 
(SEQ ID NO:14); DLDH (dihydrolipamide dehydrogenase, NM_000108) (SEQ ID NO:16); 
MAT2B (methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:18); STC-2 
(stanniocalcin-2, NM_003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal 
precursor, NM_001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1, 
NM_004769) (SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ 
ID NO:26); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:28); PAR2 
(proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:30); IDE 
(insulin-degrading enzyme, NM_004969) (SEQ ID NO:32); MYO1A (myosin-1A, NM_005379) (SEQ 
ID NO:34); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:36); 
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:38); CYB5 
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon 
and flanking sequence, NM_001863) (SEQ ID NO:42); or TCF4 (NM_030756) (SEQ ID 
NO:44) polypeptide means a polypeptide as defined above or below having at least about 80% 
amino acid sequence identity with a full-length native sequence polypeptide, with or without 
the signal peptide, as disclosed herein or any other fragment of a full-length HGD marker 
polypeptide wherein one or more amino acid residues are added, or deleted, at the N- or 
C-terminus of the full-length native amino acid sequence. Ordinarily, a HGD marker 
polypeptide variant will have at least 
about 80% amino acid sequence identity, preferably at least about 81% amino acid sequence 
identity, more preferably at least about 82% amino acid sequence identity, more preferably at 

30 least about 83% amino acid sequence identity, more preferably at least about 84% amino acid 
sequence identity, more preferably at least about 85% amino acid sequence identity, more 
preferably at least about 86% amino acid sequence identity, more preferably at least about 
87% amino acid sequence identity, more preferably at least about 88% amino acid sequence 
identity, more preferably at least about 89% amino acid sequence identity, more preferably at 
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least about 90% amino acid sequence identity, more preferably at least about 91% amino acid 
sequence identity, more preferably at least about 92% amino acid sequence identity, more 
preferably at least about 93% amino acid sequence identity, more preferably at least about 
94% amino acid sequence identity, more preferably at least about 95% amino acid sequence 
identity, more preferably at least about 96% amino acid sequence identity, more preferably at
least about 97% amino acid sequence identity, more preferably at least about 98% amino acid 
sequence identity and most preferably at least about 99% amino acid sequence identity with a 
full-length native sequence polypeptide sequence lacking the signal peptide as disclosed 
herein, an extracellular domain of a HGD marker polypeptide, with or without the signal
peptide, as disclosed herein or any other fragment of a full-length HGD marker polypeptide
sequence as disclosed herein. Ordinarily, a HGD marker polypeptide variant is at least about 
10 amino acids in length, often at least about 20 amino acids in length, more often at least 
about 30 amino acids in length, more often at least about 40 amino acids in length, more often 
at least about 50 amino acids in length, more often at least about 60 amino acids in length,
more often at least about 70 amino acids in length, more often at least about 80 amino acids in
length, more often at least about 90 amino acids in length, more often at least about 100 amino 
acids in length, more often at least about 150 amino acids in length, more often at least about 
200 amino acids in length, more often at least about 300 amino acids in length, or more. 

"Percent (%) amino acid sequence identity" with respect to the amino acid sequence of
any of the HGD marker polypeptides identified herein is defined as the percentage of amino 
acid residues in a candidate sequence that are identical with the amino acid residues in an ET-1
(endothelin-1, NM_001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenopus laevis)
homolog, NM_006408) (SEQ ID NO:4); ADAM8 (NM_001109) (SEQ ID NO:6); PRSS8
(Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:8); AXO1 (Axonin-1
precursor, NM_005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM_021969)
(SEQ ID NO:12); TM7SF1 (NM_003272) (SEQ ID NO:14); DLDH (dihydrolipamide
dehydrogenase, NM_000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II,
beta, NM_013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:20);
PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:22); SLNAC1
(sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:24); CAH4 (carbonic
anhydrase iv precursor, NM_000717) (SEQ ID NO:26); PA21 (phospholipase a2 precursor,
NM_000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor,
NM_005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID
NO:32); MYOIA (myosin-1A, NM_005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450
monooxygenase, NM_000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase
(Refsum disease), NM_006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end,
NM_001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence,
NM_001863) (SEQ ID NO:42); or TCF4 (NM_030756) (SEQ ID NO:44) polypeptide, after
aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent 
sequence identity, and not considering any conservative substitutions as part of the sequence 
identity. Alignment for purposes of determining percent amino acid sequence identity can be 
achieved in various ways that are within the skill in the art, for instance, using publicly 
available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign 
(DNASTAR) software. Those skilled in the art can determine appropriate parameters for 
measuring alignment, including any algorithms needed to achieve maximal alignment over the 
full-length of the sequences being compared. For purposes herein, however, % amino acid 
sequence identity values are obtained as described below by using the sequence comparison 
computer program ALIGN-2, wherein the complete source code for the ALIGN-2 program is 
provided in Table 5. The ALIGN-2 sequence comparison computer program was authored by 
Genentech, Inc., and the source code shown in Table 5 has been filed with user documentation 
in the U.S. Copyright Office, Washington D.C., 20559, where it is registered under U.S. 
Copyright Registration No. TXU510087. The ALIGN-2 program is publicly available through
Genentech, Inc., South San Francisco, California or may be compiled from the source code 
provided in Table 5. The ALIGN-2 program should be compiled for use on a UNIX operating 
system, preferably digital UNIX V4.0D. All sequence comparison parameters are set by the 
ALIGN-2 program and do not vary. 

For purposes herein, the % amino acid sequence identity of a given amino acid 
sequence A to, with, or against a given amino acid sequence B (which can alternatively be 
phrased as a given amino acid sequence A that has or comprises a certain % amino acid 
sequence identity to, with, or against a given amino acid sequence B) is calculated as follows: 

100 times the fraction X/Y 

where X is the number of amino acid residues scored as identical matches by the sequence 
alignment program ALIGN-2 in that program's alignment of A and B, and where Y is the total 
number of amino acid residues in B. It will be appreciated that where the length of amino acid 
sequence A is not equal to the length of amino acid sequence B, the % amino acid sequence
identity of A to B will not equal the % amino acid sequence identity of B to A. As examples 
of % amino acid sequence identity calculations, Tables 2A-2B demonstrate how to calculate 
the % amino acid sequence identity of the amino acid sequence designated "Comparison 
Protein" to the amino acid sequence designated "PRO".
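
The 100 × X/Y calculation above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (for example by ALIGN-2) and are supplied as equal-length strings with "-" marking gaps; the function name `percent_identity` and the example sequences are hypothetical and are not taken from Tables 2A-2B.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return 100 * X / Y for the identity of sequence A to sequence B.

    X = aligned positions where A and B carry the same residue.
    Y = total number of residues in B (gap characters are not residues).
    Assumption: the inputs are an existing pairwise alignment of A and B.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    x = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    y = sum(1 for b in aligned_b if b != gap)
    if y == 0:
        raise ValueError("sequence B contains no residues")
    return 100.0 * x / y


# Hypothetical example: A has 8 residues, B has 14, and all 8 residues of A match.
a = "MKTAYIAK------"
b = "MKTAYIAKQRQISF"
a_to_b = percent_identity(a, b)  # 100 * 8 / 14
b_to_a = percent_identity(b, a)  # 100 * 8 / 8
```

Because Y is always the length of the second sequence, swapping the arguments changes the result whenever the two lengths differ, which is the asymmetry noted in the text.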

Unless specifically stated otherwise, all % amino acid sequence identity values used 
herein are obtained as described above using the ALIGN-2 sequence comparison computer 
program. However, % amino acid sequence identity may also be determined using the
sequence comparison program NCBI-BLAST2 (Altschul et al., Nucleic Acids Res., 25:3389-3402
(1997)). The NCBI-BLAST2 sequence comparison program may be downloaded from
http://www.ncbi.nlm.nih.gov. NCBI-BLAST2 uses several search parameters, wherein all of 
those search parameters are set to default values including, for example, unmask = yes, strand 
= all, expected occurrences = 10, minimum low complexity length = 15/5, multi-pass e-value
= 0.01, constant for multi-pass = 25, dropoff for final gapped alignment = 25 and scoring
matrix = BLOSUM62. 

In situations where NCBI-BLAST2 is employed for amino acid sequence comparisons, 
the % amino acid sequence identity of a given amino acid sequence A to, with, or against a 
given amino acid sequence B (which can alternatively be phrased as a given amino acid
sequence A that has or comprises a certain % amino acid sequence identity to, with, or against 
a given amino acid sequence B) is calculated as follows: 

100 times the fraction X/Y 

where X is the number of amino acid residues scored as identical matches by the sequence 
alignment program NCBI-BLAST2 in that program's alignment of A and B, and where Y is 
the total number of amino acid residues in B. It will be appreciated that where the length of 
amino acid sequence A is not equal to the length of amino acid sequence B, the % amino acid 
sequence identity of A to B will not equal the % amino acid sequence identity of B to A.

In addition, % amino acid sequence identity may also be determined using the
WU-BLAST-2 computer program (Altschul et al., Methods in Enzymology, 266:460-480 (1996)).
Most of the WU-BLAST-2 search parameters are set to the default values. Those not set to 
default values, i.e., the adjustable parameters, are set with the following values: overlap span =
1, overlap fraction = 0.125, word threshold (T) = 11, and scoring matrix = BLOSUM62. For 
purposes herein, a % amino acid sequence identity value is determined by dividing (a) the 
number of matching identical amino acids residues between the amino acid sequence of the 
PRO polypeptide of interest having a sequence derived from the native PRO polypeptide and 
the comparison amino acid sequence of interest (i.e., the sequence against which the PRO 
polypeptide of interest is being compared which may be a PRO variant polypeptide) as 
determined by WU-BLAST-2 by (b) the total number of amino acid residues of the PRO 
polypeptide of interest. For example, in the statement "a polypeptide comprising an amino 
acid sequence A which has or having at least 80% amino acid sequence identity to the amino 
acid sequence B", the amino acid sequence A is the comparison amino acid sequence of 
interest and the amino acid sequence B is the amino acid sequence of the PRO polypeptide of 
interest. 

As used herein, a "HGD marker" or "cancer marker gene or polypeptide," or "anti-[HGD
marker]" or "anti-[cancer marker]" refers to any one of the genes, polypeptides encoded
by the genes, or antibodies specific for the polypeptides described herein as diagnostic for
HGD or cancer. Thus, for example, "TCF4" refers to the gene marker or its encoded
polypeptide, whereas anti-TCF4 refers to an antibody to the TCF4-encoded polypeptide.

A "gene variant polynucleotide" as used herein refers to a nucleic acid sequence that 
varies from the native sequence of its respective HGD marker gene NCBI accession sequence 
as disclosed in Table 4A, and further refers to a nucleic acid molecule which encodes a 
biologically active polypeptide and which nucleic acid molecule has at least about 80% 
nucleic acid sequence identity with a nucleic acid sequence selected from the group of marker 
genes: ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1); AGR2 (anterior gradient 2
(Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8 (NM_001109) (SEQ ID
NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:7); AXO1
(Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor,
NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID NO:13); DLDH
(dihydrolipamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B (methionine
adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2,
NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal precursor,
NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1, NM_004769)
(SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ ID NO:25);
PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2 (proteinase activated
receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme,
NM_004969) (SEQ ID NO:31); MYOIA (myosin-1A, NM_005379) (SEQ ID NO:33);
CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35); PHYH
(phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon
and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID
NO:43), which genes encode, respectively, the full-length native polypeptides of the group:
ET-1 (endothelin-1, NM_001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenopus
laevis) homolog, NM_006408) (SEQ ID NO:4); ADAM8 (NM_001109) (SEQ ID NO:6);
PRSS8 (Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:8); AXO1 (Axonin-1
precursor, NM_005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM_021969)
(SEQ ID NO:12); TM7SF1 (NM_003272) (SEQ ID NO:14); DLDH (dihydrolipamide
dehydrogenase, NM_000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II,
beta, NM_013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:20);
PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:22); SLNAC1
(sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:24); CAH4 (carbonic
anhydrase iv precursor, NM_000717) (SEQ ID NO:26); PA21 (phospholipase a2 precursor,
NM_000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor,
NM_005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID
NO:32); MYOIA (myosin-1A, NM_005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450
monooxygenase, NM_000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase
(Refsum disease), NM_006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end,
NM_001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence,
NM_001863) (SEQ ID NO:42); and TCF4 (NM_030756) (SEQ ID NO:44) polypeptide
sequence as disclosed herein, a full-length native sequence HGD marker polypeptide sequence 
lacking the signal peptide as disclosed herein, an extracellular domain of a HGD marker 
polypeptide, with or without the signal peptide, as disclosed herein or any other fragment of a
full-length HGD marker polypeptide sequence as disclosed herein. Ordinarily, a HGD marker
variant polynucleotide will have at least about 80% nucleic acid sequence identity, more 
preferably at least about 81% nucleic acid sequence identity, more preferably at least about 
82% nucleic acid sequence identity, more preferably at least about 83% nucleic acid sequence 
identity, more preferably at least about 84% nucleic acid sequence identity, more preferably at 
least about 85% nucleic acid sequence identity, more preferably at least about 86% nucleic
acid sequence identity, more preferably at least about 87% nucleic acid sequence identity, 
more preferably at least about 88% nucleic acid sequence identity, more preferably at least 
about 89% nucleic acid sequence identity, more preferably at least about 90% nucleic acid 
sequence identity, more preferably at least about 91% nucleic acid sequence identity, more
preferably at least about 92% nucleic acid sequence identity, more preferably at least about 
93% nucleic acid sequence identity, more preferably at least about 94% nucleic acid sequence 
identity, more preferably at least about 95% nucleic acid sequence identity, more preferably at 
least about 96% nucleic acid sequence identity, more preferably at least about 97% nucleic
acid sequence identity, more preferably at least about 98% nucleic acid sequence identity and
yet more preferably at least about 99% nucleic acid sequence identity with the nucleic acid 
sequence encoding a full-length native sequence HGD marker polypeptide sequence as 
disclosed herein, a full-length native sequence HGD marker polypeptide sequence lacking the 
signal peptide as disclosed herein, an extracellular domain of a HGD marker polypeptide, with
or without the signal sequence, as disclosed herein or any other fragment of a full-length HGD
marker polypeptide sequence as disclosed herein. Variants do not encompass the native 
nucleotide sequence. 

Ordinarily, HGD marker gene variant polynucleotides are at least about 20 nucleotides 
in length, frequently at least about 30 nucleotides in length, often at least about 60 nucleotides
in length, more often at least about 90 nucleotides in length, more often at least about 120 
nucleotides in length, more often at least about 150 nucleotides in length, more often at least 
about 180 nucleotides in length, more often at least about 210 nucleotides in length, more 
often at least about 240 nucleotides in length, more often at least about 270 nucleotides in 
length, more often at least about 300 nucleotides in length, more often at least about 450
nucleotides in length, more often at least about 600 nucleotides in length, more often at least 
about 900 nucleotides in length, or more. 

"Percent (%) nucleic acid sequence identity" with respect to variant polypeptides of 
each of the HGD marker polypeptide-encoding nucleic acid sequences identified herein is
defined as the percentage of nucleotides in a candidate sequence that are identical with the 
nucleotides in a HGD marker polypeptide-encoding nucleic acid sequence, after aligning the 
sequences and introducing gaps, if necessary, to achieve the maximum percent sequence 
identity. Alignment for purposes of determining percent nucleic acid sequence identity can be 
achieved in various ways that are within the skill in the art, for instance, using publicly
available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign 
(DNASTAR) software. Those skilled in the art can determine appropriate parameters for 
measuring alignment, including any algorithms needed to achieve maximal alignment over the
full-length of the sequences being compared. For purposes herein, however, % nucleic acid
sequence identity values are obtained as described below by using the sequence comparison 
computer program ALIGN-2, wherein the complete source code for the ALIGN-2 program is 
provided in Table 5. The ALIGN-2 sequence comparison computer program was authored by 
Genentech, Inc., and the source code shown in Table 5 has been filed with user documentation
in the U.S. Copyright Office, Washington D.C., 20559, where it is registered under U.S.
Copyright Registration No. TXU510087. The ALIGN-2 program is publicly available through
Genentech, Inc., South San Francisco, California or may be compiled from the source code 
provided in Table 5. The ALIGN-2 program should be compiled for use on a UNIX operating 
system, preferably digital UNIX V4.0D. All sequence comparison parameters are set by the
ALIGN-2 program and do not vary.

For purposes herein, the % nucleic acid sequence identity of a given nucleic acid 
sequence C to, with, or against a given nucleic acid sequence D (which can alternatively be 
phrased as a given nucleic acid sequence C that has or comprises a certain % nucleic acid 
sequence identity to, with, or against a given nucleic acid sequence D) is calculated as follows:

100 times the fraction W/Z 

where W is the number of nucleotides scored as identical matches by the sequence alignment 
program ALIGN-2 in that program's alignment of C and D, and where Z is the total number of
nucleotides in D. It will be appreciated that where the length of nucleic acid sequence C is not 
equal to the length of nucleic acid sequence D, the % nucleic acid sequence identity of C to D 
will not equal the % nucleic acid sequence identity of D to C. As examples of % nucleic acid 
sequence identity calculations, Tables 2C-2D demonstrate how to calculate the % nucleic acid 
sequence identity of the nucleic acid sequence designated "Comparison DNA" to the nucleic
acid sequence designated "PRO-DNA". 
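
As a rough end-to-end illustration of the W/Z calculation, the following Python sketch aligns two nucleotide sequences and then counts identical positions. The aligner here is a plain Needleman-Wunsch global alignment with unit match, mismatch and gap scores; this is an assumption made for illustration and is not the ALIGN-2 algorithm or its fixed parameters. The function names and the example sequences are hypothetical.

```python
def global_align(c: str, d: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(c), len(d)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if c[i - 1] == d[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_c, out_d = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and c[i - 1] == d[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_c.append(c[i - 1]); out_d.append(d[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_c.append(c[i - 1]); out_d.append("-"); i -= 1
        else:
            out_c.append("-"); out_d.append(d[j - 1]); j -= 1
    return "".join(reversed(out_c)), "".join(reversed(out_d))


def percent_nucleic_identity(c: str, d: str) -> float:
    """100 * W / Z: W = identical aligned nucleotides, Z = nucleotides in D."""
    if not d:
        raise ValueError("sequence D contains no nucleotides")
    aligned_c, aligned_d = global_align(c, d)
    w = sum(1 for x, y in zip(aligned_c, aligned_d) if x == y and x != "-")
    return 100.0 * w / len(d)


# Hypothetical sequences of unequal length.
c_to_d = percent_nucleic_identity("ACGT", "ACGTACGT")  # W = 4, Z = 8
d_to_c = percent_nucleic_identity("ACGTACGT", "ACGT")  # W = 4, Z = 4
```

With a different aligner or scoring scheme the value of W, and therefore the reported percentage, may differ, which is why the text fixes a single program and parameter set.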

Unless specifically stated otherwise, all % nucleic acid sequence identity values used 
herein are obtained as described above using the ALIGN-2 sequence comparison computer 
program. However, % nucleic acid sequence identity may also be determined using the
sequence comparison program NCBI-BLAST2 (Altschul et al., Nucleic Acids Res., 25:3389-
3402 (1997)). The NCBI-BLAST2 sequence comparison program may be downloaded from 
http://www.ncbi.nlm.nih.gov. NCBI-BLAST2 uses several search parameters, wherein all of 
those search parameters are set to default values including, for example, unmask = yes, strand
= all, expected occurrences = 10, minimum low complexity length = 15/5, multi-pass e-value 
= 0.01, constant for multi-pass = 25, dropoff for final gapped alignment = 25 and scoring 
matrix = BLOSUM62. 

In situations where NCBI-BLAST2 is employed for sequence comparisons, the %
nucleic acid sequence identity of a given nucleic acid sequence C to, with, or against a given 
nucleic acid sequence D (which can alternatively be phrased as a given nucleic acid sequence 
C that has or comprises a certain % nucleic acid sequence identity to, with, or against a given 
nucleic acid sequence D) is calculated as follows: 

100 times the fraction W/Z 

where W is the number of nucleotides scored as identical matches by the sequence alignment 
program NCBI-BLAST2 in that program's alignment of C and D, and where Z is the total
number of nucleotides in D. It will be appreciated that where the length of nucleic acid
sequence C is not equal to the length of nucleic acid sequence D, the % nucleic acid sequence 
identity of C to D will not equal the % nucleic acid sequence identity of D to C. 

In addition, % nucleic acid sequence identity values may also be generated using the 
WU-BLAST-2 computer program (Altschul et al., Methods in Enzymology, 266:460-480
(1996)). Most of the WU-BLAST-2 search parameters are set to the default values. Those not
set to default values, i.e., the adjustable parameters, are set with the following values: overlap 
span = 1, overlap fraction = 0.125, word threshold (T) = 11, and scoring matrix = 
BLOSUM62. For purposes herein, a % nucleic acid sequence identity value is determined by 
dividing (a) the number of matching identical nucleotides between the nucleic acid sequence
of the PRO polypeptide-encoding nucleic acid molecule of interest having a sequence derived
from the native sequence PRO polypeptide-encoding nucleic acid and the comparison nucleic 
acid molecule of interest (i.e., the sequence against which the PRO polypeptide-encoding 
nucleic acid molecule of interest is being compared which may be a variant PRO 
polynucleotide) as determined by WU-BLAST-2 by (b) the total number of nucleotides of the
PRO polypeptide-encoding nucleic acid molecule of interest. For example, in the statement
"an isolated nucleic acid molecule comprising a nucleic acid sequence A which has or having 
at least 80% nucleic acid sequence identity to the nucleic acid sequence B", the nucleic acid 
sequence A is the comparison nucleic acid molecule of interest and the nucleic acid sequence 
B is the nucleic acid sequence of the PRO polypeptide-encoding nucleic acid molecule of
interest. 

In other embodiments, variants of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1);
AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8
(NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773)
(SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NROB2
(Nuclear hormone receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID
NO:13); DLDH (dihydrolipamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B
(methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2
(stanniocalcin-2, NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal
precursor, NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1,
NM_004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ
ID NO:25); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2
(proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE (insulin-
degrading enzyme, NM_004969) (SEQ ID NO:31); MYOIA (myosin-1A, NM_005379) (SEQ
ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35);
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon
and flanking sequence, NM_001863) (SEQ ID NO:41); or TCF4 (NM_030756) (SEQ ID
NO:43) HGD marker genes encode an active HGD marker polypeptide, and nucleic acid
sequences useful for identifying the marker genes by, for example, nucleic acid hybridization 
assays or PCR assays are capable of hybridizing, preferably under stringent hybridization and 
wash conditions, to nucleotide sequences encoding the full-length ET-1 (endothelin-1, 
NM_001955) (SEQ ID NO:1); AGR2 (anterior gradient 2 (Xenopus laevis) homolog,
NM_006408) (SEQ ID NO:3); ADAM8 (NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin
precursor, serine protease, NM_002773) (SEQ ID NO:7); AXO1 (Axonin-1 precursor,
NM_005076) (SEQ ID NO:9); NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID
NO:11); TM7SF1 (NM_003272) (SEQ ID NO:13); DLDH (dihydrolipamide dehydrogenase,
NM_000108) (SEQ ID NO:15); MAT2B (methionine adenosyltransferase II, beta,
NM_013283) (SEQ ID NO:17); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:19);
PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:21); SLNAC1
(sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:23); CAH4 (carbonic
anhydrase iv precursor, NM_000717) (SEQ ID NO:25); PA21 (phospholipase a2 precursor,
NM_000928) (SEQ ID NO:27); PAR2 (proteinase activated receptor 2 precursor,
NM_005242) (SEQ ID NO:29); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID
NO:31); MYOIA (myosin-1A, NM_005379) (SEQ ID NO:33); CYP2J2 (cytochrome P450
monooxygenase, NM_000775) (SEQ ID NO:35); PHYH (phytanoyl-CoA-hydroxylase
(Refsum disease), NM_006214) (SEQ ID NO:37); CYB5 (cytochrome b5, 3' end,
NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon and flanking sequence,
NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID NO:43) gene or
hybridizable fragments thereof, which nucleotide sequences are found in the NCBI accession 
numbers listed in Table 4A for the respective polypeptides. HGD variant polypeptides may be 
those that are encoded by a HGD marker gene variant polynucleotide. 

The term "positives", in the context of the amino acid sequence identity comparisons 
performed as described above, includes amino acid residues in the sequences compared that 
are not only identical, but also those that have similar properties. Amino acid residues that 
score a positive value to an amino acid residue of interest are those that are either identical to 
the amino acid residue of interest or are a preferred substitution (as defined in Table 4A
below) of the amino acid residue of interest. 

For purposes herein, the % value of positives of a given amino acid sequence A to, 
with, or against a given amino acid sequence B (which can alternatively be phrased as a given 
amino acid sequence A that has or comprises a certain % positives to, with, or against a given
amino acid sequence B) is calculated as follows: 

100 times the fraction X/Y 

where X is the number of amino acid residues scoring a positive value as defined above by the
sequence alignment program ALIGN-2 in that program's alignment of A and B, and where Y 
is the total number of amino acid residues in B. It will be appreciated that where the length of 
amino acid sequence A is not equal to the length of amino acid sequence B, the % positives of 
A to B will not equal the % positives of B to A. 
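
The % positives calculation differs from % identity only in what is counted in X. A minimal Python sketch follows, assuming a pre-computed alignment with "-" as the gap character. The set of substitution pairs is a small hypothetical placeholder and is not the preferred-substitution table referenced in the text; treating each pair as unordered is also an assumption.

```python
# Hypothetical placeholder pairs; the actual preferred substitutions are those
# defined in the specification's table, not this set.
HYPOTHETICAL_PREFERRED = {frozenset("KR"), frozenset("IV"), frozenset("DE")}


def percent_positives(aligned_a: str, aligned_b: str,
                      preferred=HYPOTHETICAL_PREFERRED, gap: str = "-") -> float:
    """Return 100 * X / Y for the positives of sequence A to sequence B.

    X = aligned positions that are identical or a preferred substitution.
    Y = total number of residues in B (gap characters are not residues).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    x = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # a gap can never score as a positive
        if a == b or frozenset((a, b)) in preferred:
            x += 1
    y = sum(1 for b in aligned_b if b != gap)
    if y == 0:
        raise ValueError("sequence B contains no residues")
    return 100.0 * x / y


# Hypothetical example: 3 identical positions, 2 preferred substitutions
# (K/R and V/I), and 1 non-positive pair (W/D), so X = 5 and Y = 6.
positives = percent_positives("MKVLDW", "MRILDD")
```

For the same pair of sequences the % positives value is never lower than the % identity value, since every identical position also counts as a positive.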

"Isolated," when used to describe the various polypeptides disclosed herein, means 
polypeptide that has been identified and separated and/or recovered from a component of its 
natural environment. Preferably, the isolated polypeptide is free of association with all 
components with which it is naturally associated. Contaminant components of its natural
environment are materials that would typically interfere with diagnostic or therapeutic uses for 
the polypeptide, and may include enzymes, hormones, and other proteinaceous or non- 
proteinaceous solutes. In preferred embodiments, the polypeptide will be purified (1) to a 
degree sufficient to obtain at least 15 residues of N-terminal or internal amino acid sequence
by use of a spinning cup sequenator, or (2) to homogeneity by SDS-PAGE under non-reducing
or reducing conditions using Coomassie blue or, preferably, silver stain. Isolated polypeptide 
includes polypeptide in situ within recombinant cells, since at least one component of the ET-1
(endothelin-1, NM_001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenopus laevis)
homolog, NM_006408) (SEQ ID NO:4); ADAM8 (NM_001109) (SEQ ID NO:6); PRSS8
(Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:8); AXO1 (Axonin-1
precursor, NM_005076) (SEQ ID NO:10); NROB2 (Nuclear hormone receptor, NM_021969)
(SEQ ID NO:12); TM7SF1 (NM_003272) (SEQ ID NO:14); DLDH (dihydrolipamide
dehydrogenase, NM_000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II,
beta, NM_013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:20);
PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:22); SLNAC1
(sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:24); CAH4 (carbonic
anhydrase iv precursor, NM_000717) (SEQ ID NO:26); PA21 (phospholipase a2 precursor,
NM_000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor,
NM_005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID
NO:32); MYOIA (myosin-1A, NM_005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450
monooxygenase, NM_000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase
(Refsum disease), NM_006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end,
NM_001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence,
NM_001863) (SEQ ID NO:42); or TCF4 (NM_030756) (SEQ ID NO:44) polypeptide's
natural environment will not be present. Ordinarily, however, isolated polypeptide will be
prepared by at least one purification step. 



An "isolated" nucleic acid molecule encoding an ET-1 (endothelin-1, NM_001955) 
(SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID
NO:4); ADAM8 (NM_001109) (SEQ ID NO:6); PRSS8 (Prostasin precursor, serine protease,
NM_002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:10);
NROB2 (Nuclear hormone receptor, NM_021969) (SEQ ID NO:12); TM7SF1 (NM_003272)
(SEQ ID NO:14); DLDH (dihydrolipamide dehydrogenase, NM_000108) (SEQ ID NO:16);
MAT2B (methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:18); STC-2
(stanniocalcin-2, NM_003714) (SEQ ID NO:20); PPBI (alkaline phosphatase, intestinal
precursor, NM_001631) (SEQ ID NO:22); SLNAC1 (sodium channel receptor SLNAC1,
NM_004769) (SEQ ID NO:24); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ
ID NO:26); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:28); PAR2
(proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:30); IDE (insulin-
degrading enzyme, NM_004969) (SEQ ID NO:32); MYOIA (myosin-1A, NM_005379) (SEQ
ID NO:34); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:36);
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:38); CYB5
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon
and flanking sequence, NM_001863) (SEQ ID NO:42); or TCF4 (NM_030756) (SEQ ID
NO:44) polypeptide or an "isolated" nucleic acid encoding an anti-[HGD marker polypeptide]
antibody, is a nucleic acid molecule that is identified and separated from at least one 
contaminant nucleic acid molecule with which it is ordinarily associated in the natural source 
of the HGD marker genes or the anti-[HGD marker polypeptide]-encoding nucleic acid.
Preferably, the isolated nucleic acid is free of association with all components with which it is
naturally associated. An isolated polypeptide or nucleic acid sequence is other than in the 
form or setting in which it is found in nature. Isolated nucleic acid molecules therefore are 
distinguished from the nucleic acid molecule as it exists in natural cells. However, an isolated 
nucleic acid molecule encoding a HGD marker polypeptide or an anti-[HGD marker
polypeptide] antibody includes HGD marker gene nucleic acid molecules and anti-[HGD
marker polypeptide]-encoding nucleic acid molecules contained in cells that ordinarily express
HGD marker polypeptides or express anti-[HGD marker polypeptide] antibodies where, for
example, the nucleic acid molecule is in a chromosomal location different from that of natural 
cells. 

The term "control sequences" refers to DNA sequences necessary for the expression of
an operably linked coding sequence in a particular host organism. The control sequences that
are suitable for prokaryotes, for example, include a promoter, optionally an operator sequence,
and a ribosome binding site. Eukaryotic cells are known to utilize promoters, polyadenylation
signals, and enhancers.

Nucleic acid is "operably linked" when it is placed into a functional relationship with
another nucleic acid sequence. For example, DNA for a presequence or secretory leader is
operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in
the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding
sequence if it affects the transcription of the sequence; or a ribosome binding site is operably
linked to a coding sequence if it is positioned so as to facilitate translation. Generally,
"operably linked" means that the DNA sequences being linked are contiguous, and, in the case
of a secretory leader, contiguous and in reading phase. However, enhancers do not have to be
contiguous. Linking is accomplished by ligation at convenient restriction sites. If such sites
do not exist, synthetic oligonucleotide adaptors or linkers are used in accordance with
conventional practice.

The term "antibody" is used in the broadest sense and specifically covers, for example,
single anti-[HGD marker polypeptide] monoclonal antibodies (including antagonist and
neutralizing antibodies), anti-[HGD marker polypeptide] antibody compositions with
polyepitopic specificity, single chain anti-[HGD marker polypeptide] antibodies, and
fragments thereof (see below). The term "monoclonal antibody" as used herein refers to an
antibody obtained from a population of substantially homogeneous antibodies, i.e., the
individual antibodies comprising the population are identical except for possible
naturally-occurring mutations that may be present in minor amounts.

"Stringency" of hybridization reactions is readily determinable by one of ordinary skill
in the art, and generally is an empirical calculation dependent upon probe length, washing
temperature, and salt concentration. In general, longer probes require higher temperatures for
proper annealing, while shorter probes need lower temperatures. Hybridization generally
depends on the ability of denatured DNA to reanneal when complementary strands are present
in an environment below their melting temperature. The higher the degree of desired
homology between the probe and hybridizable sequence, the higher the relative temperature
that can be used. As a result, it follows that higher relative temperatures would tend to
make the reaction conditions more stringent, while lower temperatures less so. For additional
details and explanation of stringency of hybridization reactions, see Ausubel et al., Current
Protocols in Molecular Biology, Wiley Interscience Publishers, (1995).

"Stringent conditions" or "high stringency conditions", as defined herein, may be 
identified by those that: (1) employ low ionic strength and high temperature for washing, for
example 0.015 M sodium chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate at
50°C; (2) employ during hybridization a denaturing agent, such as formamide, for example,
50% (v/v) formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1%
polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM sodium
chloride, 75 mM sodium citrate at 42°C; or (3) employ 50% formamide, 5 x SSC (0.75 M
NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium
pyrophosphate, 5 x Denhardt's solution, sonicated salmon sperm DNA (50 µg/ml), 0.1% SDS,
and 10% dextran sulfate at 42°C, with washes at 42°C in 0.2 x SSC (sodium chloride/sodium
citrate) and 50% formamide at 55°C, followed by a high-stringency wash consisting of 0.1 x
SSC containing EDTA at 55°C.

"Moderately stringent conditions" may be identified as described by Sambrook et al.,
Molecular Cloning: A Laboratory Manual, New York: Cold Spring Harbor Press, 1989, and
include the use of washing solution and hybridization conditions (e.g., temperature, ionic
strength and % SDS) less stringent than those described above. An example of moderately
stringent conditions is overnight incubation at 37°C in a solution comprising: 20% formamide,
5 x SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5 x
Denhardt's solution, 10% dextran sulfate, and 20 mg/ml denatured sheared salmon sperm
DNA, followed by washing the filters in 1 x SSC at about 35°C-50°C. The skilled artisan
will recognize how to adjust the temperature, ionic strength, etc. as necessary to accommodate
factors such as probe length and the like.
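
The SSC strengths quoted in the stringency definitions can be cross-checked with a short
calculation. The sketch below is illustrative only and assumes the conventional definition of
1 x SSC as 0.15 M sodium chloride and 0.015 M sodium citrate, which agrees with the 5 x SSC
figures given for the stringent conditions (0.75 M NaCl, 0.075 M sodium citrate). The names
ONE_X_SSC and ssc_composition are hypothetical and do not appear in the specification.

```python
# Illustrative helper, not part of the specification.
# Assumption: 1 x SSC = 0.15 M NaCl and 0.015 M sodium citrate.
ONE_X_SSC = {"NaCl_M": 0.15, "sodium_citrate_M": 0.015}


def ssc_composition(fold):
    """Return the molar NaCl and sodium citrate concentrations of `fold` x SSC."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {name: round(conc * fold, 6) for name, conc in ONE_X_SSC.items()}


if __name__ == "__main__":
    # Strengths mentioned in the stringency definitions.
    for fold in (5, 1, 0.2, 0.1):
        print(f"{fold} x SSC:", ssc_composition(fold))
```

Under this assumption, the 0.015 M sodium chloride/0.0015 M sodium citrate wash of condition (1)
corresponds to 0.1 x SSC, and the 150 mM NaCl and 15 mM trisodium citrate figures quoted in the
moderately stringent example correspond to the 1 x composition.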

The term "epitope tagged" when used herein refers to a chimeric polypeptide
comprising a HGD marker polypeptide fused to a "tag polypeptide". The tag polypeptide has
enough residues to provide an epitope against which an antibody can be made, yet is short
enough such that it does not interfere with activity of the polypeptide to which it is fused. The
tag polypeptide preferably also is fairly unique so that the antibody does not substantially
cross-react with other epitopes. Suitable tag polypeptides generally have at least six amino
acid residues and usually between about 8 and 50 amino acid residues (preferably, between
about 10 and 20 amino acid residues).

"Active" or "activity" for the purposes herein refers to form(s) of ET-1 (endothelin-1, 
NM_001955) (SEQ ID NO:2); AGR2 (anterior gradient 2 (Xenopus laevis) homolog,
NM_006408) (SEQ ID NO:4); ADAM8 (NM_001109) (SEQ ID NO:6); PRSS8 (Prostasin
precursor, serine protease, NM_002773) (SEQ ID NO:8); AXO1 (Axonin-1 precursor,
NM_005076) (SEQ ID NO:10); NR0B2 (Nuclear hormone receptor, NM_021969) (SEQ ID
NO:12); TM7SF1 (NM_003272) (SEQ ID NO:14); DLDH (dihydrolipoamide dehydrogenase,
NM_000108) (SEQ ID NO:16); MAT2B (methionine adenosyltransferase II, beta,
NM_013283) (SEQ ID NO:18); STC-2 (stanniocalcin-2, NM_003714) (SEQ ID NO:20);
PPBI (alkaline phosphatase, intestinal precursor, NM_001631) (SEQ ID NO:22); SLNAC1
(sodium channel receptor SLNAC1, NM_004769) (SEQ ID NO:24); CAH4 (carbonic
anhydrase iv precursor, NM_000717) (SEQ ID NO:26); PA21 (phospholipase a2 precursor,
NM_000928) (SEQ ID NO:28); PAR2 (proteinase activated receptor 2 precursor,
NM_005242) (SEQ ID NO:30); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID
NO:32); MYO1A (myosin-IA, NM_005379) (SEQ ID NO:34); CYP2J2 (cytochrome P450
monooxygenase, NM_000775) (SEQ ID NO:36); PHYH (phytanoyl-CoA-hydroxylase
(Refsum disease), NM_006214) (SEQ ID NO:38); CYB5 (cytochrome b5, 3' end,
NM_001914) (SEQ ID NO:40); COXVIb (coxVIb gene, last exon and flanking sequence,
NM_001863) (SEQ ID NO:42); or TCF4 (NM_030756) (SEQ ID NO:44) polypeptides which
retain a biological and/or an immunological activity/property of a native or naturally-occurring
HGD marker polypeptide, wherein "biological" activity refers to a function (either inhibitory
or stimulatory) caused by a native or naturally-occurring HGD marker polypeptide other than
the ability to induce the production of an antibody against an antigenic epitope possessed by a
native or naturally-occurring HGD marker polypeptide and an "immunological" activity refers
to the ability to induce the production of an antibody against an antigenic epitope possessed by
a native or naturally-occurring HGD marker polypeptide.

"Biological activity" in the context of an antibody or another antagonist molecule, or
therapeutic compound that can be identified by the screening assays disclosed herein (e.g., an
organic or inorganic small molecule, peptide, etc.) is used to refer to the ability of such
molecules to bind or complex with the polypeptides encoded by the amplified genes identified
herein, or otherwise interfere with the interaction of the encoded polypeptides with other
cellular proteins or otherwise interfere with the transcription or translation of a HGD marker
polypeptide. "Biological activity" in the context of an agonist molecule that enhances the
activity of, for example, native anti-angiogenic molecules refers to the ability of such
molecules to bind or complex with the polypeptides encoded by the amplified genes identified
herein or otherwise modify the interaction of the encoded polypeptides with other cellular
proteins or otherwise enhance the transcription or translation of a TIMP1 or thrombospondin 2
polypeptide. A preferred biological activity is growth inhibition of a target tumor cell.
Another preferred biological activity is cytotoxic activity resulting in the death of the target
tumor cell.

The term "biological activity" in the context of a HGD marker polypeptide means the
typical activity of the HGD marker polypeptide in the cell.

The phrase "immunological activity" means immunological cross-reactivity with at
least one epitope of a HGD marker polypeptide.

"Immunological cross-reactivity" as used herein means that the candidate polypeptide 
is capable of competitively inhibiting the qualitative biological activity of a HGD marker 
polypeptide having this activity with polyclonal antisera raised against the known active HGD
marker polypeptide. Such antisera are prepared in conventional fashion by injecting goats or
rabbits, for example, subcutaneously with the known active analogue in complete Freund's
adjuvant, followed by booster intraperitoneal or subcutaneous injection in incomplete Freund's
adjuvant. The immunological cross-reactivity preferably is "specific", which means that the
binding affinity of the immunologically cross-reactive molecule (e.g., antibody) identified, to the
corresponding HGD marker polypeptide is significantly higher (preferably at least about
2-times, more preferably at least about 4-times, even more preferably at least about 8-times,
most preferably at least about 10-times higher) than the binding affinity of that molecule to
any other known native polypeptide.
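
The tiered preference for "specific" cross-reactivity can be read as a simple ratio test. The
following is a minimal sketch under stated assumptions: binding affinities are supplied as
association constants (a larger value means tighter binding), and the "at least about" thresholds
are treated as exact cut-offs purely for illustration. The function name and the example values
are hypothetical.

```python
# Hypothetical helper, not part of the specification.
# Thresholds follow the 2-, 4-, 8- and 10-fold tiers named in the definition.
TIERS = [
    (10.0, "most preferred (at least 10-fold)"),
    (8.0, "even more preferred (at least 8-fold)"),
    (4.0, "more preferred (at least 4-fold)"),
    (2.0, "preferred (at least 2-fold)"),
]


def specificity_tier(affinity_for_marker, affinity_for_other):
    """Classify the fold difference in affinity (marker versus any other polypeptide)."""
    if affinity_for_marker <= 0 or affinity_for_other <= 0:
        raise ValueError("affinities must be positive")
    ratio = affinity_for_marker / affinity_for_other
    for threshold, label in TIERS:
        if ratio >= threshold:
            return label
    return "not specific (below 2-fold)"


if __name__ == "__main__":
    # Example values are invented association constants in 1/M.
    print(specificity_tier(1e9, 1e8))
    print(specificity_tier(5e8, 1e8))
    print(specificity_tier(1.5e8, 1e8))
```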

The term "antagonist" is used in the broadest sense, and includes any molecule that
partially or fully blocks, inhibits, or neutralizes a biological activity of a native HGD marker
polypeptide disclosed herein or the transcription or translation thereof, particularly when the
HGD marker polypeptide is expressed about 1.5-fold above the level of expression in normal
tissue controls. Suitable antagonist molecules specifically include antagonist antibodies or
antibody fragments, binding fragments, peptides, small organic molecules, anti-sense nucleic
acids, etc. Included are methods for identifying antagonists that comprise contacting an ET-1
(endothelin-1, NM_001955) (SEQ ID NO:1 or 2); AGR2 (anterior gradient 2 (Xenopus laevis)
homolog, NM_006408) (SEQ ID NO:3 or 4); ADAM8 (NM_001109) (SEQ ID NO:5 or 6); PRSS8
(Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:7 or 8); AXO1 (Axonin-1
precursor, NM_005076) (SEQ ID NO:9 or 10); NR0B2 (Nuclear hormone receptor,
NM_021969) (SEQ ID NO:11 or 12); TM7SF1 (NM_003272) (SEQ ID NO:13 or 14); DLDH
(dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15 or 16); MAT2B
(methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17 or 18); STC-2
(stanniocalcin-2, NM_003714) (SEQ ID NO:19 or 20); PPBI (alkaline phosphatase, intestinal
precursor, NM_001631) (SEQ ID NO:21 or 22); SLNAC1 (sodium channel receptor
SLNAC1, NM_004769) (SEQ ID NO:23 or 24); CAH4 (carbonic anhydrase iv precursor,
NM_000717) (SEQ ID NO:25 or 26); PA21 (phospholipase a2 precursor, NM_000928) (SEQ
ID NO:27 or 28); PAR2 (proteinase activated receptor 2 precursor, NM_005242) (SEQ ID
NO:29 or 30); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID NO:31 or 32); MYO1A
(myosin-IA, NM_005379) (SEQ ID NO:33 or 34); CYP2J2 (cytochrome P450
monooxygenase, NM_000775) (SEQ ID NO:35 or 36); PHYH (phytanoyl-CoA-hydroxylase
(Refsum disease), NM_006214) (SEQ ID NO:37 or 38); CYB5 (cytochrome b5, 3' end,
NM_001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene, last exon and flanking
sequence, NM_001863) (SEQ ID NO:41 or 42); and TCF4 (NM_030756) (SEQ ID NO:43 or
44) gene or polypeptide with a candidate antagonist molecule and measuring a detectable
change in one or more biological activities normally associated with the ET-1 (endothelin-1,
NM_001955) (SEQ ID NO:1 or 2); AGR2 (anterior gradient 2 (Xenopus laevis) homolog,
NM_006408) (SEQ ID NO:3 or 4); ADAM8 (NM_001109) (SEQ ID NO:5 or 6); PRSS8
(Prostasin precursor, serine protease, NM_002773) (SEQ ID NO:7 or 8); AXO1 (Axonin-1
precursor, NM_005076) (SEQ ID NO:9 or 10); NR0B2 (Nuclear hormone receptor,
NM_021969) (SEQ ID NO:11 or 12); TM7SF1 (NM_003272) (SEQ ID NO:13 or 14); DLDH
(dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15 or 16); MAT2B
(methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17 or 18); STC-2
(stanniocalcin-2, NM_003714) (SEQ ID NO:19 or 20); PPBI (alkaline phosphatase, intestinal
precursor, NM_001631) (SEQ ID NO:21 or 22); SLNAC1 (sodium channel receptor
SLNAC1, NM_004769) (SEQ ID NO:23 or 24); CAH4 (carbonic anhydrase iv precursor,
NM_000717) (SEQ ID NO:25 or 26); PA21 (phospholipase a2 precursor, NM_000928) (SEQ
ID NO:27 or 28); PAR2 (proteinase activated receptor 2 precursor, NM_005242) (SEQ ID
NO:29 or 30); IDE (insulin-degrading enzyme, NM_004969) (SEQ ID NO:31 or 32); MYO1A
(myosin-IA, NM_005379) (SEQ ID NO:33 or 34); CYP2J2 (cytochrome P450
monooxygenase, NM_000775) (SEQ ID NO:35 or 36); PHYH (phytanoyl-CoA-hydroxylase
(Refsum disease), NM_006214) (SEQ ID NO:37 or 38); CYB5 (cytochrome b5, 3' end,
NM_001914) (SEQ ID NO:39 or 40); COXVIb (coxVIb gene, last exon and flanking
sequence, NM_001863) (SEQ ID NO:41 or 42); and TCF4 (NM_030756) (SEQ ID NO:43 or
44) gene or polypeptide.

A "small molecule" is defined herein to have a molecular weight below about 500 
Daltons.

"Antibodies" (Abs) and "immunoglobulins" (Igs) are glycoproteins having the same 
structural characteristics. While antibodies exhibit binding specificity to a specific antigen, 
immunoglobulins include both antibodies and other antibody-like molecules which lack
antigen specificity. Polypeptides of the latter kind are, for example, produced at low levels by
the lymph system and at increased levels by myelomas. The term "antibody" is used in the
broadest sense and specifically covers, without limitation, intact monoclonal antibodies,
polyclonal antibodies, multispecific antibodies (e.g., bispecific antibodies) formed from at
least two intact antibodies, and antibody fragments so long as they exhibit the desired
biological activity.

"Native antibodies" and "native immunoglobulins" are usually heterotetrameric 
glycoproteins of about 150,000 daltons, composed of two identical light (L) chains and two 
identical heavy (H) chains. Each light chain is linked to a heavy chain by one covalent
disulfide bond, while the number of disulfide linkages varies among the heavy chains of
different immunoglobulin isotypes. Each heavy and light chain also has regularly spaced
intrachain disulfide bridges. Each heavy chain has at one end a variable domain (VH) followed
by a number of constant domains. Each light chain has a variable domain at one end (VL) and
a constant domain at its other end; the constant domain of the light chain is aligned with the
first constant domain of the heavy chain, and the light-chain variable domain is aligned with
the variable domain of the heavy chain. Particular amino acid residues are believed to form an
interface between the light- and heavy-chain variable domains.

The term "variable" refers to the fact that certain portions of the variable domains
differ extensively in sequence among antibodies and are used in the binding and specificity of
each particular antibody for its particular antigen. However, the variability is not evenly
distributed throughout the variable domains of antibodies. It is concentrated in three segments
called complementarity-determining regions (CDRs) or hypervariable regions both in the
light-chain and the heavy-chain variable domains. The more highly conserved portions of
variable domains are called the framework (FR) regions. The variable domains of native
heavy and light chains each comprise four FR regions, largely adopting a β-sheet
configuration, connected by three CDRs, which form loops connecting, and in some cases
forming part of, the β-sheet structure. The CDRs in each chain are held together in close
proximity by the FR regions and, with the CDRs from the other chain, contribute to the
formation of the antigen-binding site of antibodies (see Kabat et al., NIH Publ. No. 91-3242,
Vol. I, pages 647-669 (1991)). The constant domains are not involved directly in binding an
antibody to an antigen, but exhibit various effector functions, such as participation of the
antibody in antibody-dependent cellular toxicity.

The term "hypervariable region" when used herein refers to the amino acid residues of
an antibody which are responsible for antigen-binding. The hypervariable region comprises
amino acid residues from a "complementarity determining region" or "CDR" (i.e., residues
24-34 (L1), 50-56 (L2) and 89-97 (L3) in the light chain variable domain and 31-35 (H1),
50-65 (H2) and 95-102 (H3) in the heavy chain variable domain; Kabat et al., Sequences of
Proteins of Immunological Interest, 5th Ed. Public Health Service, National Institutes of
Health, Bethesda, MD. [1991]) and/or those residues from a "hypervariable loop" (i.e.,
residues 26-32 (L1), 50-52 (L2) and 91-96 (L3) in the light chain variable domain and 26-32
(H1), 53-55 (H2) and 96-101 (H3) in the heavy chain variable domain; Chothia and Lesk, J.
Mol. Biol., 196:901-917 [1987]). "Framework" or "FR" residues are those variable domain
residues other than the hypervariable region residues as herein defined.
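
The residue ranges in the definition of "hypervariable region" can be tabulated and queried
directly. The sketch below is illustrative only: it transcribes the ranges quoted above, treats
them as inclusive integer positions, and ignores the insertion codes that a full antibody
numbering scheme would need. The names KABAT_CDRS, HYPERVARIABLE_LOOPS and is_hypervariable are
hypothetical.

```python
# Illustrative lookup, not part of the specification.
# Ranges are transcribed from the definition above and treated as inclusive.
KABAT_CDRS = {
    "L": {"L1": (24, 34), "L2": (50, 56), "L3": (89, 97)},
    "H": {"H1": (31, 35), "H2": (50, 65), "H3": (95, 102)},
}
HYPERVARIABLE_LOOPS = {
    "L": {"L1": (26, 32), "L2": (50, 52), "L3": (91, 96)},
    "H": {"H1": (26, 32), "H2": (53, 55), "H3": (96, 101)},
}


def is_hypervariable(chain, position):
    """True if the position lies in a CDR and/or a hypervariable loop of the chain.

    `chain` is "L" (light) or "H" (heavy). Positions outside every listed
    range are framework (FR) residues under the definition above.
    """
    if chain not in KABAT_CDRS:
        raise ValueError("chain must be 'L' or 'H'")
    for table in (KABAT_CDRS, HYPERVARIABLE_LOOPS):
        for start, end in table[chain].values():
            if start <= position <= end:
                return True
    return False


if __name__ == "__main__":
    # Heavy-chain position 26 is covered only by the hypervariable-loop range for H1.
    print(is_hypervariable("H", 26))
    # Heavy-chain position 40 falls in no listed range, so it is a framework residue.
    print(is_hypervariable("H", 40))
```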

"Antibody fragments" comprise a portion of an intact antibody, preferably the antigen 
binding or variable region of the intact antibody. Examples of antibody fragments include
Fab, Fab', F(ab')2, and Fv fragments; diabodies; linear antibodies (Zapata et al., Protein Eng.,
8(10):1057-1062 [1995]); single-chain antibody molecules; and multispecific antibodies
formed from antibody fragments.

Papain digestion of antibodies produces two identical antigen-binding fragments,
called "Fab" fragments, each with a single antigen-binding site, and a residual "Fc" fragment,
whose name reflects its ability to crystallize readily. Pepsin treatment yields an F(ab')2
fragment that has two antigen-combining sites and is still capable of cross-linking antigen.


"Fv" is the minimum antibody fragment which contains a complete antigen-recognition 
and -binding site. This region consists of a dimer of one heavy- and one light-chain variable 
domain in tight, non-covalent association. It is in this configuration that the three CDRs of 
each variable domain interact to define an antigen-binding site on the surface of the VH-VL
dimer. Collectively, the six CDRs confer antigen-binding specificity to the antibody.
However, even a single variable domain (or half of an Fv comprising only three CDRs specific 
for an antigen) has the ability to recognize and bind antigen, although at a lower affinity than 
the entire binding site. 

The Fab fragment also contains the constant domain of the light chain and the first
constant domain (CH1) of the heavy chain. Fab' fragments differ from Fab fragments by the
addition of a few residues at the carboxy terminus of the heavy chain CH1 domain including
one or more cysteines from the antibody hinge region. Fab'-SH is the designation herein for
Fab' in which the cysteine residue(s) of the constant domains bear a free thiol group. F(ab')2
antibody fragments originally were produced as pairs of Fab' fragments which have hinge
cysteines between them. Other chemical couplings of antibody fragments are also known.

The "light chains" of antibodies (immunoglobulins) from any vertebrate species can be
assigned to one of two clearly distinct types, called kappa (κ) and lambda (λ), based on the
amino acid sequences of their constant domains.

Depending on the amino acid sequence of the constant domain of their heavy chains, 
immunoglobulins can be assigned to different classes. There are five major classes of 
immunoglobulins: IgA, IgD, IgE, IgG, and IgM, and several of these may be further divided 
into subclasses (isotypes), e.g., IgG1, IgG2, IgG3, IgG4, IgA1, and IgA2. The heavy-chain
constant domains that correspond to the different classes of immunoglobulins are called α, δ,
ε, γ, and μ, respectively. The subunit structures and three-dimensional configurations of
different classes of immunoglobulins are well known.

The term "monoclonal antibody" as used herein refers to an antibody obtained from a
population of substantially homogeneous antibodies, i.e., the individual antibodies comprising
the population are identical except for possible naturally occurring mutations that may be
present in minor amounts. Monoclonal antibodies are highly specific, being directed against a
single antigenic site. Furthermore, in contrast to conventional (polyclonal) antibody
preparations which typically include different antibodies directed against different
determinants (epitopes), each monoclonal antibody is directed against a single determinant on
the antigen. In addition to their specificity, the monoclonal antibodies are advantageous in
that they are synthesized by the hybridoma culture, uncontaminated by other
immunoglobulins. The modifier "monoclonal" indicates the character of the antibody as being
obtained from a substantially homogeneous population of antibodies, and is not to be
construed as requiring production of the antibody by any particular method. For example, the
monoclonal antibodies to be used in accordance with the present invention may be made by
the hybridoma method first described by Kohler et al., Nature, 256:495 [1975], or may be
made by recombinant DNA methods (see, e.g., U.S. Patent No. 4,816,567). The "monoclonal
antibodies" may also be isolated from phage antibody libraries using the techniques described
in Clackson et al., Nature, 352:624-628 [1991] and Marks et al., J. Mol. Biol., 222:581-597
(1991), for example.

The monoclonal antibodies herein specifically include "chimeric" antibodies
(immunoglobulins) in which a portion of the heavy and/or light chain is identical with or
homologous to corresponding sequences in antibodies derived from a particular species or
belonging to a particular antibody class or subclass, while the remainder of the chain(s) is
identical with or homologous to corresponding sequences in antibodies derived from another
species or belonging to another antibody class or subclass, as well as fragments of such
antibodies, so long as they exhibit the desired biological activity (U.S. Patent No. 4,816,567;
Morrison et al., Proc. Natl. Acad. Sci. USA, 81:6851-6855 [1984]).

"Humanized" forms of non-human (e.g., murine) antibodies are chimeric 
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2
or other antigen-binding subsequences of antibodies) which contain minimal sequence derived
from non-human immunoglobulin. For the most part, humanized antibodies are human
immunoglobulins (recipient antibody) in which residues from a CDR of the recipient are
replaced by residues from a CDR of a non-human species (donor antibody) such as mouse, rat
or rabbit having the desired specificity, affinity, and capacity. In some instances, Fv FR
residues of the human immunoglobulin are replaced by corresponding non-human residues.
Furthermore, humanized antibodies may comprise residues which are found neither in the
recipient antibody nor in the imported CDR or framework sequences. These modifications are
made to further refine and maximize antibody performance. In general, the humanized
antibody will comprise substantially all of at least one, and typically two, variable domains, in
which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the FR regions are those of a human
immunoglobulin sequence. The humanized antibody optimally also will comprise at least a
portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin. For further details, see Jones et al., Nature, 321:522-525 (1986);
Riechmann et al., Nature, 332:323-329 [1988]; and Presta, Curr. Op. Struct. Biol., 2:593-596
(1992). The humanized antibody includes a PRIMATIZED™ antibody wherein the
antigen-binding region of the antibody is derived from an antibody produced by immunizing
macaque monkeys with the antigen of interest.

"Single-chain Fv" or "sFv" antibody fragments comprise the VH and VL domains of
antibody, wherein these domains are present in a single polypeptide chain. Preferably, the Fv
polypeptide further comprises a polypeptide linker between the VH and VL domains which
enables the sFv to form the desired structure for antigen binding. For a review of sFv see
Pluckthun in The Pharmacology of Monoclonal Antibodies, vol. 113, Rosenburg and Moore
eds., Springer-Verlag, New York, pp. 269-315 (1994).

The term "diabodies" refers to small antibody fragments with two antigen-binding
sites, which fragments comprise a heavy-chain variable domain (VH) connected to a
light-chain variable domain (VL) in the same polypeptide chain (VH-VL). By using a linker that is
too short to allow pairing between the two domains on the same chain, the domains are forced
to pair with the complementary domains of another chain and create two antigen-binding sites.
Diabodies are described more fully in, for example, EP 404,097; WO 93/11161; and Hollinger
et al., Proc. Natl. Acad. Sci. USA, 90:6444-6448 (1993).

An "isolated" antibody is one which has been identified and separated and/or recovered 
from a component of its natural environment. Contaminant components of its natural 
environment are materials which would interfere with diagnostic or therapeutic uses for the
antibody, and may include enzymes, hormones, and other proteinaceous or nonproteinaceous
solutes. In preferred embodiments, the antibody will be purified (1) to greater than 95% by 
weight of antibody as determined by the Lowry method, and most preferably more than 99% 
by weight, (2) to a degree sufficient to obtain at least 15 residues of N-terminal or internal 
amino acid sequence by use of a spinning cup sequenator, or (3) to homogeneity by SDS- 
PAGE under reducing or nonreducing conditions using Coomassie blue or, preferably, silver 
stain. Isolated antibody includes the antibody in situ within recombinant cells since at least 
one component of the antibody's natural environment will not be present. Ordinarily, 
however, isolated antibody will be prepared by at least one purification step. 

The word "label" when used herein refers to a detectable compound or composition 
which is conjugated directly or indirectly to the antibody so as to generate a "labeled" 
antibody. The label may be detectable by itself (e.g., radioisotope labels or fluorescent labels) 
or, in the case of an enzymatic label, may catalyze chemical alteration of a substrate 
compound or composition which is detectable. Radionuclides that can serve as detectable 
labels include, for example, I-131, I-123, I-125, Y-90, Re-188, Re-186, At-211, Cu-67,
Bi-212, and Pd-109. The label may also be a non-detectable entity such as a toxin.

A "liposome" is a small vesicle composed of various types of lipids, phospholipids 
and/or surfactant which is useful for delivery of a drug (such as a CXCR4; Laminin alpha 4; 
TIMP1; Type IV collagen alpha 1; Laminin alpha 3; Adrenomedullin; Thrombospondin 2;
Type I collagen alpha 2; Type VI collagen alpha 2; Type VI collagen alpha 3; Latent TGFbeta
binding protein 2 (LTBP2); Serine or cysteine protease inhibitor heat shock protein (HSP47);
Procollagen-lysine, 2-oxoglutarate 5-dioxygenase; connexin 43; Type IV collagen alpha 2; 
Connexin 37; Ephrin Al; Laminin beta 2; Integrin alpha 1; Stanniocalcin 1; Thrombospondin 
4; or CD36 polypeptide or antibody thereto and, optionally, a chemotherapeutic agent) to a 
mammal. The components of the liposome are commonly arranged in a bilayer formation, 
similar to the lipid arrangement of biological membranes. 

As used herein, the term "immunoadhesin" designates antibody-like molecules which 
combine the binding specificity of a heterologous protein (an "adhesin") with the effector 
functions of immunoglobulin constant domains. Structurally, the immunoadhesins comprise a 
fusion of an amino acid sequence with the desired binding specificity which is other than the 
antigen recognition and binding site of an antibody (i.e., is "heterologous"), and an
immunoglobulin constant domain sequence. The adhesin part of an immunoadhesin molecule
typically is a contiguous amino acid sequence comprising at least the binding site of a receptor
or a ligand. The immunoglobulin constant domain sequence in the immunoadhesin may be
obtained from any immunoglobulin, such as IgG-1, IgG-2, IgG-3, or IgG-4 subtypes, IgA
(including IgA-1 and IgA-2), IgE, IgD or IgM.

"Up-regulation," "increased expression," and "overexpression" are used 
interchangeably and, as used herein, mean at least about a 1.5-fold increase in expression, 
alternatively at least about a 2-fold increase in expression, alternatively with at least about a 
2.5-fold or higher increase in expression of a gene measured as an increase in its DNA
(amplification), its mRNA (increased transcription), or in the level of polypeptide encoded by
the gene. Alternatively, up-regulation or increased expression is determined using a Z score as
a p value < 0.07 relative to a normal tissue control.
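
The two criteria in this definition can be sketched as simple checks. The code below is a
minimal illustration under stated assumptions, not the method used in the examples: expression
levels are assumed to be positive linear-scale values, "at least about 1.5-fold" is treated as
an exact cut-off, and the Z score is converted to a p value with a one-sided normal tail, which
the text does not specify. All function names are hypothetical.

```python
import math


def fold_change(sample_level, control_level):
    """Ratio of sample expression to normal-tissue control expression."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return sample_level / control_level


def is_upregulated_by_fold(sample_level, control_level, min_fold=1.5):
    """Fold-change criterion; min_fold may be raised to 2.0 or 2.5."""
    return fold_change(sample_level, control_level) >= min_fold


def one_sided_p_from_z(z):
    """Upper-tail p value of a standard normal Z score (assumed one-sided)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def is_upregulated_by_z(z, alpha=0.07):
    """Z-score criterion: p value below alpha relative to the control."""
    return one_sided_p_from_z(z) < alpha


if __name__ == "__main__":
    print(is_upregulated_by_fold(3.0, 2.0))   # 1.5-fold, meets the lowest threshold
    print(is_upregulated_by_fold(2.0, 2.0))   # no change
    print(is_upregulated_by_z(1.5))           # one-sided p is roughly 0.067
```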

The term "package insert" is used to refer to instructions customarily included in
commercial packages of therapeutic products, that contain information about the indications,
usage, dosage, administration, contraindications and/or warnings concerning the use of such
therapeutic products.

It will be clearly understood that, although a number of art publications are referred to
herein, this reference does not constitute an admission that any of these documents forms part
of the common general knowledge in the art, in Australia or in any other country.

Throughout this specification and the claims, the terms "comprise," "comprises," and 
"comprising" are used in a non-exclusive sense, except where the context requires otherwise.

EXAMPLES 

The following examples are offered by way of illustration and not by way of
limitation. The examples are provided so as to provide those of ordinary skill in the art with a
complete disclosure and description of how to make and use the compounds, compositions,
and methods of the invention and are not intended to limit the scope of what the inventors
regard as their invention. Efforts have been made to ensure accuracy with respect to numbers
used (e.g., amounts, temperature, etc.), but some experimental errors and deviation should be
accounted for. Unless indicated otherwise, parts are in parts by weight, temperature is in
degrees C, and pressure is at or near atmospheric. The disclosures of all citations in the
specification are expressly incorporated herein by reference.

Example 1: Patients and Tissue Collection 

Esophageal mucosal biopsies were obtained from patients undergoing surveillance 
endoscopy at the Western General Hospital and Royal Infirmary, Edinburgh during 2000-1. 
The study was approved by the Lothian Research and Ethics Committee and written, informed 
consent was obtained from all patients. All procedures were performed by one of two 
experienced endoscopists with expertise in Barrett's esophagus in a standard manner according 
to a local protocol for Barrett's surveillance. BE was defined as tongues or circumferential 
salmon pink mucosa extending for at least 3 cm above the gastroesophageal junction. At 
endoscopy, careful note was made of the length of the BE segment, severity of any esophagitis 
if present and the presence of macroscopically visible abnormalities within the BE. Data on 
smoking history, use of acid-suppressing drugs and Helicobacter pylori status were also 
recorded. 

Paired biopsies were taken. One sample was fixed in formalin for histology and the 
other stored fresh-frozen (-70°C) for microarray analysis. Two gastrointestinal pathologists 
reviewed all specimens, which were categorized as: normal squamous esophagus, BE 
(columnar lined esophagus with intestinal metaplasia and the presence of goblet cells and 
alcian blue positive mucin), BE with changes indeterminate for dysplasia, BE with low-grade 
dysplasia (LGD), BE with high-grade dysplasia (HGD) or BE with adenocarcinoma (CA). For 
some patients, 2 separate biopsy specimens for the same disease state were available for array 
analysis. Additional matched samples were also analyzed (e.g., biopsies of BE adjacent to 
carcinoma in BE from the same patient). Analyzed samples included 10 normal esophagus, 28 
samples of BE from 20 patients, 6 samples of LGD from 3 patients, 3 samples indeterminate 
for dysplasia from 2 patients, 6 samples of HGD from 3 patients, 10 samples of BE adjacent to 
CA (BE-CA) from 7 patients, and 16 samples of CA from 10 patients. 

Microarrays containing 9031 genes were generated by printing PCR products derived 
from cDNA clones (Invitrogen, California and Genentech, Inc.) on glass slides coated with 
3-aminopropyltriethoxysilane (Aldrich, Milwaukee WI) and 1,4-phenylenediisothiocyanate 
(Aldrich, Milwaukee WI) using a robotic arrayer (Norgren Systems, Mountain View, 
California). RNA was isolated by CsCl step gradient (Kingston, Current 
Protocols in Molecular Biology 1:4.2.5-4.2.6 (1998)); typically 0.1 - 2 µg of total RNA was 
obtained. Probes for array analysis were generated by conservative amplification and 
subsequent labelling as follows: double-stranded DNA generated from 0.1 µg of total RNA 
(Invitrogen, Carlsbad, CA) was amplified using a single round of a modified in vitro 
transcription protocol (MEGAscript T7 from Ambion, Austin, Texas) (Gelder et al., Proc. 
Natl. Acad. Sci. USA 87:1663-1667 (1990)). The resulting cRNA was used as a template to 
generate a sense DNA probe using random primers (9mers, 0.15 mg/ml), Alexa 488 dUTP or 
Alexa 546 dUTP (40 µM and 6 µM, respectively, Molecular Probes, Eugene, Oregon) and 
MMLV-derived reverse transcriptase (Invitrogen, Carlsbad, CA). A reference probe to reflect 
general epithelial cell expression was generated from 0.1 µg of total RNA from a pool of liver, 
lung and kidney (Clontech, Palo Alto, California). Probes were hybridized to arrays overnight 
in 50% formamide / 5X SSC at 37 °C and washed the next day in 2X SSC, 0.2% SDS followed 
by 0.2X SSC, 0.2% SDS. Array images were collected using a CCD-camera based imaging 
system (Norgren Systems, Mountain View, California) equipped with a Xenon light source 
and optical filters appropriate for each dye. Full dynamic-range images were collected 
(Autograb, Genentech, Inc.) and intensities and ratios extracted using automated gridding and 
data extraction software (gImage, Genentech, Inc.) built on a Matlab (The MathWorks, Natick, 
Massachusetts) platform. 

Example 3: Data Analysis 

Data were sorted to identify genes expressed above background (N intensity of > 12 
where background values range from 0 - 8) in the test sample such that only meaningful ratios 
were included. Ratio values were further normalized for experimental scatter at different 
intensity values within each experiment by plotting log ratio versus N intensity and by fitting a 
normal distribution at each intensity level. A measure of standard deviation (Z score) around a 
mean of zero was derived for each gene in each experiment and this value was used in data 
mining. Specifically, for each microarray, data were normalized by computing Z-scores, which 
were obtained from a scatterplot of the logarithm of the ratio of the test and reference data 
versus the logarithm of the minimum of the test and reference data. The median of the ratio as 
a function of intensity was estimated by applying the loess algorithm to the scatterplot. The 
standard error was estimated by applying loess to the square root of the absolute residuals, 
squaring the result to obtain the median absolute deviation (MAD), and making a 
multiplicative correction to convert from MAD to a standard error. The Z scores were 
determined for each ratio by dividing its vertical distance from the median loess curve by the 
standard error at that intensity. 

A computational process useful for computing Z-scores may be written in a standard 
high-level statistical language, S-Plus, as follows: 

# keep only spots with positive test and reference intensities
pos.test <- test[test > 0 & ref > 0] 
pos.ref <- ref[test > 0 & ref > 0] 
# order spots by the smaller of the two intensities
minorder <- order(pmin(pos.test, pos.ref)) 
y <- log(pos.test[minorder] + 10) - log(pos.ref[minorder] + 10) 
x <- log(pmin(pos.test[minorder], pos.ref[minorder])) 
# residuals about the loess curve of log ratio versus log intensity
residuals <- loess(y ~ x)$residuals 
sqresiduals <- sqrt(abs(residuals)) 
sqrt.mad <- loess(sqresiduals ~ x)$fitted 
# square to obtain the MAD, then convert the MAD to a standard error
sigma <- sqrt.mad*sqrt.mad/0.6745 
zscore <- ifelse(sigma > 0, residuals/sigma, 0) 

This code may be executed in a commercially available S-Plus program (see, for example, 
http://www.insightful.com), or in a freely available substitute program, R 
(http://www.r-project.org). 
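
The same calculation can be sketched outside S-Plus. The Python version below is an illustration only, written under stated assumptions: it uses the standard library alone, and `local_linear_fit` is a hypothetical, simplified stand-in for loess (a tricube-weighted local linear fit with an assumed span of 0.75), so its values will not match the S-Plus `loess` function exactly. None of the function or parameter names come from the specification.

```python
import math


def local_linear_fit(x, y, span=0.75):
    """Simplified loess stand-in: tricube-weighted local linear fit at each x."""
    n = len(x)
    k = min(n, max(2, math.ceil(span * n)))  # points used in each local fit
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        w = [(1.0 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def z_scores(test, ref, offset=10.0, span=0.75):
    """Return (test, ref, z) tuples ordered by min(test, ref), as in the S-Plus code."""
    # Keep only spots with positive intensities in both channels.
    pairs = sorted(
        ((t, r) for t, r in zip(test, ref) if t > 0 and r > 0),
        key=min,
    )
    x = [math.log(min(t, r)) for t, r in pairs]
    y = [math.log(t + offset) - math.log(r + offset) for t, r in pairs]
    # Residuals about the smoothed log ratio.
    center = local_linear_fit(x, y, span)
    residuals = [yi - ci for yi, ci in zip(y, center)]
    # Smooth sqrt(|residual|), square it, and rescale by 0.6745 as in the text.
    sqrt_mad = local_linear_fit(x, [math.sqrt(abs(e)) for e in residuals], span)
    sigma = [s * s / 0.6745 for s in sqrt_mad]
    return [
        (t, r, e / s if s > 0 else 0.0)
        for (t, r), e, s in zip(pairs, residuals, sigma)
    ]
```

Calling `z_scores(test, ref)` with two equal-length lists of spot intensities returns one tuple per retained spot. Spots with a non-positive intensity in either channel are dropped, as in the S-Plus code.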

Example 4: Differential Expression in Barrett's Esophagus-to-Adenocarcinoma Disease 
Stages 

Samples and Data Mining: 

High-quality data were obtained from > 90% of biopsy specimens, including those of 
poor RNA quality and very limited RNA quantity (e.g., less than 200 ng total RNA). A data 
mining strategy was applied to identify genes specifically associated with the different stages 
of disease progression. Experiments were grouped into disease categories based on pathologic 
diagnosis, and these groups were compared to identify genes with significantly elevated 
expression for at least 25% of the samples within a disease group with respect to both the 
epithelial pool reference and the normal esophagus group. Typically, genes with elevated 
expression were identified as those with Z scores of > 1.7 (p < 0.05) in the disease group, 
corresponding to ratio values of 2 - 20 in most cases. A total of 460 genes satisfied these 
criteria across the disease groups BE, dysplasia, and carcinoma (some genes are associated 
with more than one disease group). Selected genes (117) are listed (Tables 1, 2, 3). All 
dysplasia samples (high-, low-grade and indeterminate) were combined into a single group to 
improve data analysis, and the genes identified were then further inspected to determine if 
they were more prevalent in low- or high-grade dysplasia. HGD sample data were 
independently analyzed to determine gene expression profiles diagnostic for high-grade 
dysplasia (Table 4A). 
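
The selection rule described above can be sketched as follows. This is a hedged illustration, not the inventors' procedure: the cutoffs (Z > 1.7, at least 25% of samples) come from the text, but the specification does not spell out how the comparison with the normal esophagus group was made, so the sketch assumes a gene is kept only if it does not meet the same criterion in the normal group. All function names and the dictionary layout are hypothetical.

```python
import math

Z_CUTOFF = 1.7       # elevated-expression cutoff from the text
MIN_FRACTION = 0.25  # minimum share of samples in a disease group, from the text


def upper_tail_p(z):
    """One-sided p value for a standard normal Z score."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def fraction_elevated(z_values, cutoff=Z_CUTOFF):
    """Share of samples whose Z score exceeds the cutoff (0.0 if no samples)."""
    if not z_values:
        return 0.0
    return sum(1 for z in z_values if z > cutoff) / len(z_values)


def select_genes(disease_z, normal_z, min_fraction=MIN_FRACTION, cutoff=Z_CUTOFF):
    """disease_z and normal_z map gene name -> list of per-sample Z scores."""
    selected = []
    for gene, zs in disease_z.items():
        in_disease = fraction_elevated(zs, cutoff) >= min_fraction
        # Assumption: the gene must not satisfy the criterion in normal tissue.
        in_normal = fraction_elevated(normal_z.get(gene, []), cutoff) >= min_fraction
        if in_disease and not in_normal:
            selected.append(gene)
    return sorted(selected)
```

`upper_tail_p(1.7)` is below 0.05, which is consistent with the p value quoted alongside the Z cutoff.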

Inflammation: 

Significant expression of proinflammatory, costimulatory and inducible cytokines and 
receptors was observed in BE, dysplasia and carcinoma, and the most prevalent genes are 
listed (Table 1). Some binding partners were detected, such as putative inflammatory cytokine 
IL-17 family member IL-17E and its receptor IL-17BR, and SCYA20/LARC and receptor 
CCR6 (Lee et al., J. Biol. Chem. 276:1660-1664 (2001); and Baba et al., J. Biol. Chem. 
272:14893-14898 (1997)). SCYA20 is expressed in the epithelium of the small intestine and 
is chemotactic for lymphocytes and dendritic cells (Tanaka et al., Eur. J. Immunol. 29:644-642 
(1999)). Activin A is a TGF beta superfamily member that can act as a potent mediator of cell 
growth and differentiation and may be involved in response to injury (Munz et al., EMBO J. 
18:5205-5215 (1999)). It was co-expressed, particularly in carcinoma in Barrett's samples, 
with its serine-threonine kinase receptor AVRII (the type I receptor was also detected but less 
well correlated). Chemokine receptors CXCR4 and CCR7 have been detected on a variety of 
inflammatory cell types, but have also been described as highly expressed in breast tumor 
cells, with possible involvement in lymph node metastasis (Muller et al., Nature 410:50-56 
(2001)). In this study, CXCR4 in particular was associated with high-grade dysplasia and 
detected in some samples of adenocarcinoma. 

TABLE 1A Cytokines and chemokines up-regulated in BE-to-Adenocarcinoma 

NCBI RefSeq   Gene                              BE    D     BE-CA   CA
NM_000594     TNF-α                             *           *       *
NM_002546     Osteoprotegerin                   *           *
NM_002993     GCP-2                             (*)   *H    (*)     *
NM_025240     B7-H3                                   *L    (*)     *
NM_002995     Lymphotactin                      (*)   *             (*)
NM_005746     PBEF                              *                   (*)
NM_004591     SCYA20                                  (*)   *
NM_004843     WSX1                                    *
NM_019618     IL1-H1                            (*)         *       *
NM_000418     IL-4R                                                 *
NM_022789     IL-17E                            (*)   *     *
NM_018725     IL-17BR                                 *H            (*)
NM_014432     IL-20Rα                                 *L            (*)
NM_021798     IL-21R                            (*)         *       *
NM_002192     Activin A                               (*)   (*)     *
NM_001616     AVR2, type II activin receptor          *             *
NM_001105     Activin A type I Receptor
NM_031409     CCR6                              (*)         *       *
NM_003467     CXCR4                                   *H            (*)
NM_001838     CKR7                              (*)   (*)   *

TABLE 1B Prostaglandin synthesis-related genes up-regulated in BE-to-Adenocarcinoma 

NCBI RefSeq   Gene                              BE    D     BE-CA   CA
NM_000963     COX-2, prostaglandin synthase 2   (*)   *H            *
NM_000962     COX-1, prostaglandin synthase 1                       *
NM_007366     PLA2R phospholipase A2 R1               *     (*)     *
NM_000953     PD2R prostaglandin D2 R           (*)         (*)
NM_000959     PF2AR prostaglandin F2α R               *     (*)     (*)
NM_000957     PER3 prostaglandin E R 2                      (*)     *
NM_000960     Prostaglandin IP (I2) R           *     *     (*)

Genes are associated with the disease states BE, dysplasia (D), BE adjacent to carcinoma 
(BE-CA), or carcinoma (CA) if present in at least 25% of samples tested. (*) indicates gene 
expression changes associated with 15-25% of samples. 
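
As an illustration of how the table symbols follow from sample counts, the legend can be written as a small function. This is a sketch under assumptions: the function name is hypothetical, and the 15% lower bound is treated as inclusive, which the legend does not state explicitly. Integer arithmetic is used so that boundary cases such as 7 of 28 samples (exactly 25%) are not affected by rounding.

```python
def prevalence_symbol(n_positive, n_samples):
    """Map the share of samples showing a change to the symbols used in the tables."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if 100 * n_positive >= 25 * n_samples:
        return "*"      # present in at least 25% of samples
    if 100 * n_positive >= 15 * n_samples:
        return "(*)"    # present in 15-25% of samples (lower bound assumed inclusive)
    return ""           # below 15%: no entry in the table


# Example with the 28 BE samples analyzed in this study.
for n in (7, 5, 4):
    print(n, repr(prevalence_symbol(n, 28)))
```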

An otherwise rare IL-1 homolog, IL1-H1, was highly expressed in carcinoma in 
Barrett's, and also in the matched adjacent BE tissue from the same patients (Fig. 1). A 
previous study of the murine IL1-H1 ortholog detected constitutive expression only in 
esophageal squamous mucosa. In addition, human IL1-H1 mRNA could be induced in TNFα- and 
IFNγ-treated keratinocytes and squamous epithelial tumor cell line A431 (Kumar et al., J. Biol. 
Chem. 275:10308-10314 (2000)). This gene is one marker of a specific esophageal squamous cell 
type exhibiting a striking induction of expression in both adenocarcinoma and patient-matched 
BE, amidst primarily intestinal and tumor markers observed in this study (Tables 2 and 3). The 
high expression in BE matched with adenocarcinoma in addition to adenocarcinoma suggests 
a possible epigenetic association. 

Cyclooxygenase isoform 2 (COX-2), which catalyzes a rate-limiting step in conversion 
of arachidonate to inflammatory prostaglandins, has been implicated in Barrett's metaplasia 
and other cancers (Morris et al., Am. J. Gastroenterol. 96:990-996 (2001); Heasley et al., J. 
Biol. Chem. 272:14501-14504 (1997); and Tsujii et al., Cell 93:705-716 (1998)). Consistent 
with previous reports, a significant increase was observed in COX-2 gene expression with 
increasing dysplasia (high-grade dysplasia) and in adenocarcinoma (Table 1B). Smaller 
changes were also observed in COX-1 and several prostaglandin receptors. Arachidonic acid is 
released from the membrane by the action of phospholipases. Phospholipase A2 expression 
associated with increasing malignancy was also observed (Table 2) along with the M-type 
receptor (PLA2R, Table 1B), consistent with studies suggesting that COX-2, PA2 and PLA2R 
are coordinately expressed (Rys-Sikora et al., Am. J. Physiol. Cell Physiol. 278:822-833 (2000)). 

Elevated expression was detected for another enzyme that generates a different class of 
biologically active eicosanoids from arachidonic acid, the epoxygenase CYP2J2 (Fig. 1B, 
Table 2). This cytochrome P450 enzyme is expressed in a variety of cell types in the small 
intestine, including epithelial cells, and may play a role in electrolyte transport, intestinal 
motility, and other processes (Wu et al., J. Biol. Chem. 271:3460-3468 (1996); Zeldin et al., 
Mol. Pharm. 51:931-943 (1997); and Node et al., Science 285:1276-1279 (1999)). Similar to 
COX-2, elevated expression is most apparent in samples of adenocarcinoma and dysplasia 
(both low-grade and high-grade dysplasia). The expression profile for CYP2J2 also reflects the 
progressive intestinal metaplasia observed in this study (Table 2). 

Intestinal Metaplasia: 

Analysis for gene expression changes associated with dysplasia revealed a large group 
of genes whose normal expression is primarily associated with the small intestine, and to a 
lesser extent, colon (Table 2). The previously described marker villin was detected (Peterson 
and Mooseker, J. Cell Sci. 102:581-600 (1992)), along with a diverse set of genes including 
cell surface cadherins and claudins, ion channels and transporters, and enzymes, many of 
which are normally associated with structural and absorptive functions of small intestinal villi. 
Increased expression of many of these genes was associated with dysplasia and a significant 
subset of carcinoma samples, with differential expression also detected in a smaller subset of 
BE samples. Furthermore, expression of the majority of genes was less prevalent in matched 
BE samples taken from the carcinoma patients, even when expression was apparent in the 
tumor sample (Fig. 2A, 2B, 3A; Table 2). This suggests that these gene expression changes are 
more specifically associated with the foci of dysplasia and developing carcinoma within the 
larger region of BE. 

Genes are associated with the disease states BE, dysplasia (D), BE adjacent to carcinoma 
(BE-CA), or carcinoma (CA) if present in at least 25% of samples tested. (*) indicates gene 
expression changes associated with 15-25% of samples. 

Normal Tissues: highest normal tissue expression is listed. SI (small intestine); C 
(colon); St (stomach); K (kidney); P (pancreas); L (liver); M (muscle); H (heart); CNS (central 
nervous system); SI-ent (intestinal enterocytes); St-par (parietal cells); O (other tissues). In the 
dysplasia column, H or L denote expression associated with high-grade or low-grade 
dysplasia, respectively. GPCR (G protein coupled receptor). "na" and "aa" refer to the 
nucleic acid and amino acid SEQ ID NO, respectively, for the associated markers. 

Examples include MYO1A, an unconventional myosin that is differentially expressed 
along the crypt-villus axis, exhibiting low level cytosolic expression in immature crypts and 
high expression in villus cells with localization at the brush border (Skowron et al., Cell Motil. 
Cytoskel. 41:308-324 (1998); and MacLennan et al., Molec. Carcinogen. 24:137-143 (1999)). 
Unlike villin, another marker of the brush border that was detected across all disease states, 
MYO1A was most associated with high-grade dysplasia and carcinoma. The novel secreted 
factor AGR2 gives one of the most striking profiles as a marker for high-grade dysplasia 
(Figure 2A). AGR2 is a human homolog of the X. laevis cement gland gene XAG-2, which is 
implicated in ectodermal patterning (Aberger et al., Mech. Dev. 72:115-130 (1998)). Elevated 
expression of this gene is also associated with hormonally-responsive high-grade esophageal 
dysplasias (Thompson and Weigel, Biochem. Biophys. Res. Commun. 251:111-116 (1998)). 

Expression of nuclear hormone receptor NR0B2 is induced by bile acids, and NR0B2 
in turn participates in transcriptional repression of the rate-limiting enzyme (CYP7A1) in bile 
synthesis (Lu et al., Mol. Cell 6:507-515 (2000)). In this study, overexpression of NR0B2 is 
detected particularly in high-grade dysplasia, in addition to some carcinomas and a subset of 
BE samples (Figure 2B). In addition to supporting the general pattern of intestinal metaplasia, 
expression of NR0B2 may further reflect the response to the unnatural exposure of esophageal 
cells to bile, which is considered to be a contributing factor in Barrett's metaplasia (Bremner 
et al., Surgery 68:209-216 (1970); and Gillen et al., Br. J. Surg. 75:1352-1355 (1988)). Bile 
acids have also been shown to activate transcription of COX-2 (Zhang et al., J. Biol. Chem. 
273:2424-2428 (1998)). 

While these gene expression profiles are consistent with the observations of an 
increased columnar cell type in BE, the most consistent changes are associated with dysplasia, 
especially high-grade dysplasia (Table 2). These genes could serve as markers for progression 
in a clinical setting. For example, the number of genes which meet the described criteria for 
elevated expression in individual samples progressively increases through BE and dysplasia. 
The average of the number of markers detected per sample is 7.6 for BE, 11.7 for low-grade 
dysplasia, and 16.4 for high-grade dysplasia. Within the BE group, 3 samples have unusually 
high scores of 12, 12, and 14 markers detected. The two samples with 12 markers are different 
biopsies from the same patient: while the overall expression profiles vary between the 2 
biopsies, they score identically in the marker analysis. Marker selection could be further 
refined to a subset associated with particular disease stages. This type of quantitative analysis 
may be of utility in identifying BE patients with greater risk of progression, and may be less 
sensitive to sampling and observer-related effects. Some of the secreted and processed factors 
listed (Table 1A, 2, 3) may even be detectable in the blood, which could further simplify 
screening. 
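
The per-sample marker score discussed above amounts to counting how many panel genes exceed the elevated-expression cutoff in a sample and averaging that count within a disease group. A minimal sketch follows, assuming each sample is represented as a hypothetical dictionary of gene name to Z score. The function names are illustrative, and the toy numbers are invented for the example and are not data from the study.

```python
Z_CUTOFF = 1.7  # elevated-expression cutoff quoted in the specification


def marker_count(sample_z, markers, cutoff=Z_CUTOFF):
    """Number of panel markers whose Z score exceeds the cutoff in one sample."""
    return sum(1 for gene in markers if sample_z.get(gene, float("-inf")) > cutoff)


def mean_marker_count(samples, markers, cutoff=Z_CUTOFF):
    """Average marker count over a list of samples (0.0 for an empty group)."""
    if not samples:
        return 0.0
    return sum(marker_count(s, markers, cutoff) for s in samples) / len(samples)


# Toy values for illustration only; they are not data from the study.
markers = ["AGR2", "TCF4", "CFTR"]
be_samples = [{"AGR2": 0.4, "TCF4": 1.9, "CFTR": 0.2}, {"AGR2": 0.1, "TCF4": 0.3}]
hgd_samples = [
    {"AGR2": 3.1, "TCF4": 2.2, "CFTR": 1.8},
    {"AGR2": 2.5, "TCF4": 1.6, "CFTR": 2.0},
]
print(mean_marker_count(be_samples, markers), mean_marker_count(hgd_samples, markers))
```

A gene missing from a sample's dictionary is treated as not detected, which is a design choice of this sketch rather than something the text specifies.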

Adenocarcinoma: 

Many of the genes differentially expressed in adenocarcinoma in Barrett's, similar to 
other solid tumors, reflect the changes occurring as the cells acquire a more proliferative and 
invasive phenotype (Table 3). Included are genes involved with growth, cell adhesion, matrix 
invasion, vascularization, and intracellular remodeling. The majority of genes are most 
prevalent in adenocarcinoma, but some are also detected at earlier stages. For example, genes 
likely to be involved in tumor angiogenesis showed significant upregulation in samples with 
dysplasia (e.g., tumor endothelial marker 1 (TEM1), Tie2 ligand 
2, VEGFC, endothelin 1). 

NCBI RefSeq   Gene families/genes               BE    D     BE-CA   CA
              Growth factors / receptors
NM_005228     EGFR                                    (*H)          *
NM_004442     EPHB2                                                 *
NM_003212     CRIPTO CR-1                       (*)   *             *
NM_004429     Ephrin B1                                             *$
              Metalloproteinases - related
NM_016155     MMP-17/MT4-MMP                                        *
NM_021801     MMP26                             (*)   (*)   (*)     *$
NM_001110     ADAM10                                        *       *
NM_001109     ADAM8                                   *H            (*)
XM_132370#    ADAM1                                   *             (*)
NM_003254     TIM1                              *     *     *       *
              Intracellular cytoskeletal
NM_001665     rho G                             (*)         *       *
NM_006113     VAV3

Genes are associated with the disease states BE, dysplasia (D), BE adjacent to carcinoma 
(BE-CA), or carcinoma (CA) if present in at least 25% of samples tested. (*) indicates gene 
expression changes associated with 15-25% of samples. 
$ indicates a target of the Wnt signalling pathway. 

The gene expression profiles in Barrett's adenocarcinoma share many similarities with 
colon tumors. For example, epidermal growth factor receptor (EGFR; previously described in 
carcinoma in BE) (al-Kasspooles et al., Int. J. Cancer 54:213-219 (1993)), along with 
other growth factor-related or cell-surface proteins such as Cripto CR1, EPHB2, MUC1, 
NCA/CEACAM6, CEA (Table 3), are often highly expressed in colon cancer (Ciardiello et al., 
Proc. Natl. Acad. Sci. USA 88:7792-7796 (1991); Liu et al., Cancer 94:934-939 (2002); 
Zimmerman et al., Proc. Natl. Acad. Sci. USA 84:2960-2964 (1987); Medina et al., Cancer 
Res. 59:1061-1070 (1999); and Ilantzis et al., Neoplasia 4:151-163 (2002)). The chloride 
channel associated with cystic fibrosis, CFTR, was upregulated in adenocarcinoma and can be 
detected in some cases of high-grade dysplasia (Table 2). This gene is also overexpressed in 
colon tumors. Furthermore, there is evidence that several genes listed are targets of Wnt 
signalling pathways (Table 3) (Tetsu and McCormick, Nature 398:422-426 (1999); Miwa et 
al., Oncol. Res. 12:469-476 (2000); Marchenko et al., Biochem. J. 363:253-262 (2002); Sagara 
et al., Biochem. and Biophys. Res. Comm. 252:117-122 (1998); Lescher et al., Dev. Dyn. 
213:440-451 (1998); Willert et al., BMC Dev. Biol. 2:1-6 (2002); and Tice et al., J. Biol. 
Chem. 277:14329-14335 (2002)), and it is possible that COX-2, which is implicated in colon 
cancer as well as adenocarcinoma in Barrett's, is a Wnt pathway target (Howe et al., Cancer 
Res. 59:1572-1577 (1999)). An additional synergistic link is suggested by the recent finding 
that EGFR is activated by prostaglandin E2, a product of COX-2 (Tsujii et al., Cell 
93:705-716 (1998); Tsujii et al., Proc. Natl. Acad. Sci. USA 94:3336-3340 (1997); and Pai et al., 
Nature Med. 8:289-293 (2002)). 



More support for Wnt/beta catenin-like induction comes from the strong induction of 
transcription factor TCF4 (TCF7L2) in several dysplasia and adenocarcinoma samples 
(Figure 3A). Knockout studies in mice indicate that TCF4 is necessary for the maintenance of 
proliferative crypts in the small intestine, and constitutive activity of TCF4 in APC-deficient 
human epithelial cells may contribute to their malignant transformation (Korinek et al., Nature 
Gen. 19:379-383 (1998)). Given its role in colon carcinogenesis, TCF4 provides another key 
link between intestinal metaplasia and carcinoma in BE. 

Most genes listed represent known genes, but the novel gene FLJ23399 was one of the 
genes most consistently observed in adenocarcinoma and patient-matched adjacent BE 
samples (Figure 3B). Expression in BE adjacent to carcinoma suggests the induction may be 
epigenetic, or possibly reflect small foci of adenocarcinoma that cannot be identified 
histologically. Increased expression of this gene was also discovered herein to be associated 
with colon tumors, and with metastatic prostate tumors (increased expression with metastasis 
as compared to primary tumors). Its function is unknown, but the presence of 4 type III 
fibronectin domains in the putative extracellular region suggests a possible role in cell adhesion 
and/or cell-matrix interactions. 

Barrett's Esophagus-to-Adenocarcinoma Disease Progression: 

Despite the difficulties associated with sampling and interpretation, the presence and 
degree of dysplasia is still the most predictive factor for risk of progression to adenocarcinoma 
(Miros et al., Gut 32:1441-1446 (1991)). Foci of carcinoma typically appear adjacent to 
dysplasia, and esophageal resections of high-grade dysplasia frequently contain previously 
unrecognized adenocarcinoma (Falk et al., Gastrointest. Endosc. 49:170-176 (1999); and 
Cameron and Carpenter, Am. J. Gastroenterol. 92:586-591 (1997)). In this study, by the time 
dysplasia was apparent, there was evidence of progressive development toward a gene 
expression profile similar to a differentiated small intestinal enterocyte (along with a small 
group of genes representative of other intestinal cell types). A possible key contributing factor 
is the increased expression of TCF4 with advancing disease. Homozygous disruption of TCF4 
in mice results in death shortly after birth, and the neonatal epithelium is composed only of 
non-dividing villus cells (Korinek et al., Nature Gen. 19:379-383 (1998)). This suggests 
that the genetic program controlled by TCF4 maintains, and possibly establishes, the crypt 
stem cells of the small intestine. In humans, TCF4 is expressed strongly in the crypts in early 
fetal development, with increasing expression on the villi up to week 22 as the small intestine 
develops (Barker et al., Am. J. Pathol. 154:29-35 (1999)). TCF4 is also expressed along the 
crypt-villus axis of adult small intestine and along the epithelial lining of the crypts of adult 
colon. The TCF4 profile observed in dysplasia and carcinoma in BE may reflect the 
inappropriate activation of a developmental pathway with a possible underlying dynamic and 
differentiating stem cell-like population, or acquisition of some of these characteristics. The 
delicate cells of the small intestine, with their specialized absorptive and digestive functions 
and rapid turnover, would seem highly susceptible to damage in the context of the esophagus 
and gastrointestinal reflux disease. 

The developing intestinal phenotype apparent by progression to dysplasia, associated 
with increased expression of TCF4, suggests some tantalizing links to the development of 
carcinoma and the similarities in gene expression between adenocarcinoma of the esophagus 
and colon. In the context of loss of APC function, association of beta catenin with TCF4 
results in constitutive transcription of Tcf target genes, a proposed crucial event in the early 
transformation of colonic epithelia in colon cancer (Korinek et al., Science 275:1784-1787 
(1997)). While there is not strong evidence of truncating mutations in APC or oncogenic beta 
catenin in esophageal adenocarcinoma, there is evidence of hypermethylation of the APC 
promoter (in 48/52 adenocarcinoma patients and 17/43 patients with BE metaplasia) 
(Kawakami et al., J. Natl. Cancer Inst. 92:1805-1811 (2000)). APC hypermethylation has also 
been implicated in progression in colon cancer (Hiltunen et al., Int. J. Cancer 70:644-648 
(1997)). In this context, it is interesting to note that elevated c-Fos expression was apparent in 
our study in both dysplasia and carcinoma (Table 3). This could perhaps be related to the 
presence of bile acids from reflux, overexpression of proglucagon-derived peptide GLP2 
(Table 2), or of TNFα (Table 1), all of which have been shown to induce c-Fos expression 
(Bakin and Curran, Science 283:387-390 (1999); Di Toro et al., Eur. J. Pharm. Sci. 
11:291-298 (2000); and Bjerknes and Cheng, Proc. Natl. Acad. Sci. USA 98:12497-12502 (2001)). 
One proposal for oncogenic transformation by c-Fos is hypermethylation resulting from 
induction of DNA 5-methylcytosine transferase (Goetze et al., Atherosclerosis 159:93-101 
(2001)). These factors may contribute to a potential increased availability of beta catenin to 
combine with TCF4 and activate transcriptional pathways that contribute to carcinogenesis. 
c-Fos may play an earlier role in intestinal metaplasia as well: studies of intestinal development 
in mice indicate that GLP2-mediated induction of c-Fos in enteric neurons signals growth of 
columnar epithelial cell progenitors and stem cells (Di Toro et al., Eur. J. Pharm. Sci. 
11:291-298 (2000)). 

Gene expression profiling of esophageal biopsies has revealed several intriguing 
associations for the progression of malignancy in the context of Barrett's esophagus. Many of 
the genes may be involved in potentiating regulatory cycles, and there is potential synergy for 
the development of adenocarcinoma between exposure to damaging agents (e.g., bile), 
inflammatory response and prostaglandin synthesis, intestinal metaplasia and TCF4 induction, 
along with induction of growth factors such as EGFR and oncogenes such as c-Fos. Subsets of 
the genes identified may also eventually serve as markers to identify patients at higher risk for 
adenocarcinoma. This could permit streamlining of expensive and time-consuming 
surveillance programs, along with earlier detection and associated improved survival chances 
for high-risk patients. 

Diagnosis of High-grade Esophageal Dysplasia and Prognosis of Esophageal 
Adenocarcinoma: 

Several HGD gene markers were discovered as being up-regulated at least 1.5-fold in 
many high-grade dysplasia samples but are up-regulated in relatively few Barrett's esophagus 
samples (see Table 4A compared to Table 4B). According to the invention, where at least 
eight of the twenty-two HGD gene markers are detected to be up-regulated at least 1.5-fold in an 
esophageal tissue sample, cells of the tissue sample are said to exhibit HGD. In addition, the 
patient from whom the sample was taken may be diagnosed as experiencing high-grade 
esophageal dysplasia. Further, the prognosis for the patient includes the likely development of 
adenocarcinoma. Based on the detection of HGD, diagnosis and prognosis, the patient may be 
treated accordingly and at an earlier stage in the BE-to-cancer progression than would 
otherwise have occurred prior to disclosure of the instant invention. Alternatively, in a test 
esophageal tissue sample, where at least one of the at least eight up-regulated HGD marker 
genes is AGR2 (SEQ ID NO:3), TM7SF1 (SEQ ID NO:13), MAT2B (SEQ ID NO:17), 
SLNAC1 (SEQ ID NO:23), or TCF4 (SEQ ID NO:43), cells of the tissue sample exhibit HGD 
and the patient is said to be diagnosed as experiencing dysplasia, particularly high-grade 
dysplasia, and is likely to develop adenocarcinoma. 
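
The counting rule in this passage can be sketched as below. This only illustrates the arithmetic of the rule (at least eight panel markers up-regulated at least 1.5-fold, optionally with one of the five named genes among them); it is not a validated diagnostic. The twenty-two marker names are listed in Table 4A, so the panel is passed in as a parameter. The function names and the dictionary of fold changes are hypothetical.

```python
# The five genes named in the alternative form of the rule.
KEY_MARKERS = frozenset({"AGR2", "TM7SF1", "MAT2B", "SLNAC1", "TCF4"})


def upregulated_markers(fold_changes, panel, min_fold=1.5):
    """Panel genes whose fold change versus control is at least min_fold."""
    return {gene for gene in panel if fold_changes.get(gene, 0.0) >= min_fold}


def exhibits_hgd(fold_changes, panel, min_markers=8, require_key_marker=False):
    """Apply the 'at least eight of the panel' rule, optionally requiring a key marker."""
    up = upregulated_markers(fold_changes, panel)
    if len(up) < min_markers:
        return False
    return (not require_key_marker) or bool(up & KEY_MARKERS)
```

With `require_key_marker=True` the function applies the alternative form of the rule, in which at least one of the eight or more up-regulated markers must be AGR2, TM7SF1, MAT2B, SLNAC1, or TCF4.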

In addition to detecting and diagnosing HGD and developing a prognosis of
esophageal adenocarcinoma, treatment of cancer, including, but not limited to,
adenocarcinoma, esophageal adenocarcinoma, and colon cancer, is also possible by
administering to a patient a therapeutically effective amount of an antagonist of one or more of
the following adenocarcinoma marker polypeptides: CAD17 (liver-intestine cadherin,
NM_004063) (SEQ ID NO:46), CLDN15 (claudin 15, NM_014343) (SEQ ID NO:48),
SLNAC1 (sodium channel, NM_004769) (SEQ ID NO:24), CFTR (chloride channel,
NM_000492) (SEQ ID NO:50), H2R (histamine H2 receptor, NM_022304) (SEQ ID NO:52),
PRSS8 (serine protease, NM_002773) (SEQ ID NO:8), PA21 (phospholipase A2 group IB,
NM_000928) (SEQ ID NO:28), AGR2 (anterior gradient 2 homolog, NM_006408) (SEQ ID
NO:4), EGFR (NM_005228) (SEQ ID NO:54), EPHB2 (NM_004442) (SEQ ID NO:56),
CRIPTO CR-1 (NM_003212) (SEQ ID NO:58), Ephrin B1 (NM_004429) (SEQ ID NO:60),
MMP-17/MT4-MMP (NM_016155) (SEQ ID NO:62), MMP26 (NM_021801) (SEQ ID
NO:64), ADAM10 (NM_001110) (SEQ ID NO:66), ADAM8 (NM_001109) (SEQ ID NO:6),
ADAM1 (XM_132370) (SEQ ID NO:68), TIMP1 (NM_003254) (SEQ ID NO:70), MUC1
(XM_053256) (SEQ ID NO:72), CEA (NM_004363) (SEQ ID NO:74), NCA (NM_002483)
(SEQ ID NO:76), Follistatin (NM_006350) (SEQ ID NO:78), Claudin 1 (NM_021101) (SEQ
ID NO:80), Claudin 14 (NM_012130) (SEQ ID NO:82), tenascin-R (NM_003285) (SEQ ID
NO:84), CAD3 (NM_001793) (SEQ ID NO:86), AXO1 (NM_005076) (SEQ ID NO:10),
CONT (NM_001843) (SEQ ID NO:88), Osteopontin (NM_000582) (SEQ ID NO:90),
Galectin 8 (NM_006499) (SEQ ID NO:92), PGS1 (biglycan, NM_001711) (SEQ ID NO:94),
Frizzled 2 (NM_001466) (SEQ ID NO:96), ISLR (NM_005545) (SEQ ID NO:98), FLJ23399
(NM_022763) (SEQ ID NO:100), TEM1 (NM_020404) (SEQ ID NO:102), Tie2 ligand2
(NM_001147) (SEQ ID NO:104), STC-2 (NM_003714) (SEQ ID NO:20), VEGFC
(NM_005429) (SEQ ID NO:106), tPA (NM_000930) (SEQ ID NO:108), Endothelin 1
(NM_001955) (SEQ ID NO:2), Thrombomodulin (NM_000361) (SEQ ID NO:110), TF
(NM_001993) (SEQ ID NO:112), GPR4 (NM_005282) (SEQ ID NO:114), GPR66
(NM_006056) (SEQ ID NO:116), SLC22A2 (NM_003058) (SEQ ID NO:118), MLSN1
(NM_002420) (SEQ ID NO:120), or ATN2 (Na/K transport, NM_000702) (SEQ ID NO:122).
The antagonist is a small molecule that binds and inactivates the polypeptide; binds and
inactivates a precursor of the polypeptide; prevents translation of the polypeptide; prevents its
transcription; or the like. Alternatively, the antagonist is an antibody that specifically binds
the polypeptide and inhibits or prevents its activity. Where the antagonist is an antibody, the
antibody is optionally a monoclonal antibody, a humanized antibody, or a binding fragment
thereof. The treatment involves contacting a cancer cell with an antagonist of at least one of
the polypeptides encoded by the adenocarcinoma marker genes listed above, alternatively with
an antagonist of at least three, alternatively of at least five, and alternatively of at least
eight of the polypeptides encoded by the adenocarcinoma marker genes listed above.

Further, a method of screening for a compound that inhibits cancer cell growth or
causes the death of a cancer cell, particularly an adenocarcinoma cell, an esophageal
adenocarcinoma cell, or a colon cancer cell, is an aspect of the invention. Accordingly, the
screening method involves contacting a candidate compound with a cancer cell, such as one
expressing at least one, three, five, eight or more of the adenocarcinoma gene markers selected
from the group consisting of
CAD17 (liver-intestine cadherin, NM_004063) (SEQ ID NO:45), CLDN15 (claudin 15,
NM_014343) (SEQ ID NO:47), SLNAC1 (sodium channel, NM_004769) (SEQ ID NO:23),
CFTR (chloride channel, NM_000492) (SEQ ID NO:49), H2R (histamine H2 receptor,
NM_022304) (SEQ ID NO:51), PRSS8 (serine protease, NM_002773) (SEQ ID NO:7), PA21
(phospholipase A2 group IB, NM_000928) (SEQ ID NO:27), AGR2 (anterior gradient 2
homolog, NM_006408) (SEQ ID NO:3), EGFR (NM_005228) (SEQ ID NO:53), EPHB2
(NM_004442) (SEQ ID NO:55), CRIPTO CR-1 (NM_003212) (SEQ ID NO:57), Ephrin B1
(NM_004429) (SEQ ID NO:59), MMP-17/MT4-MMP (NM_016155) (SEQ ID NO:61),
MMP26 (NM_021801) (SEQ ID NO:63), ADAM10 (NM_001110) (SEQ ID NO:65),
ADAM8 (NM_001109) (SEQ ID NO:5), ADAM1 (XM_132370) (SEQ ID NO:67), TIMP1
(NM_003254) (SEQ ID NO:69), MUC1 (XM_053256) (SEQ ID NO:71), CEA (NM_004363)
(SEQ ID NO:73), NCA (NM_002483) (SEQ ID NO:75), Follistatin (NM_006350) (SEQ ID
NO:77), Claudin 1 (NM_021101) (SEQ ID NO:79), Claudin 14 (NM_012130) (SEQ ID
NO:81), tenascin-R (NM_003285) (SEQ ID NO:83), CAD3 (NM_001793) (SEQ ID NO:85),
AXO1 (NM_005076) (SEQ ID NO:9), CONT (NM_001843) (SEQ ID NO:87), Osteopontin
(NM_000582) (SEQ ID NO:89), Galectin 8 (NM_006499) (SEQ ID NO:91), PGS1 (biglycan,
NM_001711) (SEQ ID NO:93), Frizzled 2 (NM_001466) (SEQ ID NO:95), ISLR
(NM_005545) (SEQ ID NO:97), FLJ23399 (NM_022763) (SEQ ID NO:99), TEM1
(NM_020404) (SEQ ID NO:101), Tie2 ligand2 (NM_001147) (SEQ ID NO:103), STC-2
(NM_003714) (SEQ ID NO:19), VEGFC (NM_005429) (SEQ ID NO:105), tPA
(NM_000930) (SEQ ID NO:107), Endothelin 1 (NM_001955) (SEQ ID NO:1),
Thrombomodulin (NM_000361) (SEQ ID NO:109), TF (NM_001993) (SEQ ID NO:111),
GPR4 (NM_005282) (SEQ ID NO:113), GPR66 (NM_006056) (SEQ ID NO:115), SLC22A2
(NM_003058) (SEQ ID NO:117), MLSN1 (NM_002420) (SEQ ID NO:119), and ATN2
(Na/K transport, NM_000702) (SEQ ID NO:121), followed by determining cancer cell growth
inhibition or cancer cell death.

Example 5: Nucleic acid and amino acid sequence identity determinations:

As shown below, Table 5 provides the complete source code for the ALIGN-2
sequence comparison computer program. This source code may be routinely compiled for use
on a UNIX operating system to provide the ALIGN-2 sequence comparison computer
program.

In addition, disclosed herein are hypothetical exemplifications for using the below
described method to determine % amino acid sequence identity and % nucleic acid sequence
identity using the ALIGN-2 sequence comparison computer program, wherein "PRO"
represents the amino acid sequence of a hypothetical HGD marker polypeptide of interest,
"Comparison Protein" represents the amino acid sequence of a polypeptide against which the
"PRO" polypeptide of interest is being compared, "PRO-DNA" represents a hypothetical
HGD marker polypeptide-encoding nucleic acid sequence of interest, "Comparison DNA"
represents the nucleotide sequence of a nucleic acid molecule against which the "PRO-DNA"
nucleic acid molecule of interest is being compared, "X", "Y", and "Z" each represent
different hypothetical amino acid residues and "N", "L" and "V" each represent different
hypothetical nucleotides.

Table 5 

/*
 *
 * C-C increased from 12 to 15
 * Z is average of EQ
 * B is average of ND
 * match with stop is _M; stop-stop = 0; J (joker) match = 0
 */

#define _M      -8      /* value of a match with a stop */

int _day[26][26] = {
/*         A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z */
/* A */ { 2, 0,-2, 0, 0,-4, 1,-1,-1, 0,-1,-2,-1, 0,_M, 1, 0,-2, 1, 1, 0, 0,-6, 0,-3, 0},
/* B */ { 0, 3,-4, 3, 2,-5, 0, 1,-2, 0, 0,-3,-2, 2,_M,-1, 1, 0, 0, 0, 0,-2,-5, 0,-3, 1},
/* C */ {-2,-4,15,-5,-5,-4,-3,-3,-2, 0,-5,-6,-5,-4,_M,-3,-5,-4, 0,-2, 0,-2,-8, 0, 0,-5},
/* D */ { 0, 3,-5, 4, 3,-6, 1, 1,-2, 0, 0,-4,-3, 2,_M,-1, 2,-1, 0, 0, 0,-2,-7, 0,-4, 2},
/* E */ { 0, 2,-5, 3, 4,-5, 0, 1,-2, 0, 0,-3,-2, 1,_M,-1, 2,-1, 0, 0, 0,-2,-7, 0,-4, 3},
/* F */ {-4,-5,-4,-6,-5, 9,-5,-2, 1, 0,-5, 2, 0,-4,_M,-5,-5,-4,-3,-3, 0,-1, 0, 0, 7,-5},
/* G */ { 1, 0,-3, 1, 0,-5, 5,-2,-3, 0,-2,-4,-3, 0,_M,-1,-1,-3, 1, 0, 0,-1,-7, 0,-5, 0},
/* H */ {-1, 1,-3, 1, 1,-2,-2, 6,-2, 0, 0,-2,-2, 2,_M, 0, 3, 2,-1,-1, 0,-2,-3, 0, 0, 2},
/* I */ {-1,-2,-2,-2,-2, 1,-3,-2, 5, 0,-2, 2, 2,-2,_M,-2,-2,-2,-1, 0, 0, 4,-5, 0,-1,-2},
/* J */ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,_M, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
/* K */ {-1, 0,-5, 0, 0,-5,-2, 0,-2, 0, 5,-3, 0, 1,_M,-1, 1, 3, 0, 0, 0,-2,-3, 0,-4, 0},
/* L */ {-2,-3,-6,-4,-3, 2,-4,-2, 2, 0,-3, 6, 4,-3,_M,-3,-2,-3,-3,-1, 0, 2,-2, 0,-1,-2},
/* M */ {-1,-2,-5,-3,-2, 0,-3,-2, 2, 0, 0, 4, 6,-2,_M,-2,-1, 0,-2,-1, 0, 2,-4, 0,-2,-1},
/* N */ { 0, 2,-4, 2, 1,-4, 0, 2,-2, 0, 1,-3,-2, 2,_M,-1, 1, 0, 1, 0, 0,-2,-4, 0,-2, 1},
/* O */ {_M,_M,_M,_M,_M,_M,_M,_M,_M,_M,_M,_M,_M,_M, 0,_M,_M,_M,_M,_M,_M,_M,_M,_M,_M,_M},
/* P */ { 1,-1,-3,-1,-1,-5,-1, 0,-2, 0,-1,-3,-2,-1,_M, 6, 0, 0, 1, 0, 0,-1,-6, 0,-5, 0},
/* Q */ { 0, 1,-5, 2, 2,-5,-1, 3,-2, 0, 1,-2,-1, 1,_M, 0, 4, 1,-1,-1, 0,-2,-5, 0,-4, 3},
/* R */ {-2, 0,-4,-1,-1,-4,-3, 2,-2, 0, 3,-3, 0, 0,_M, 0, 1, 6, 0,-1, 0,-2, 2, 0,-4, 0},
/* S */ { 1, 0, 0, 0, 0,-3, 1,-1,-1, 0, 0,-3,-2, 1,_M, 1,-1, 0, 2, 1, 0,-1,-2, 0,-3, 0},
/* T */ { 1, 0,-2, 0, 0,-3, 0,-1, 0, 0, 0,-1,-1, 0,_M, 0,-1,-1, 1, 3, 0, 0,-5, 0,-3, 0},
/* U */ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,_M, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
/* V */ { 0,-2,-2,-2,-2,-1,-1,-2, 4, 0,-2, 2, 2,-2,_M,-1,-2,-2,-1, 0, 0, 4,-6, 0,-2,-2},
/* W */ {-6,-5,-8,-7,-7, 0,-7,-3,-5, 0,-3,-2,-4,-4,_M,-6,-5, 2,-2,-5, 0,-6,17, 0, 0,-6},
/* X */ { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,_M, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
/* Y */ {-3,-3, 0,-4,-4, 7,-5, 0,-1, 0,-4,-1,-2,-2,_M,-5,-4,-4,-3,-3, 0,-2, 0, 0,10,-4},
/* Z */ { 0, 1,-5, 2, 3,-5, 0, 2,-2, 0, 0,-2,-1, 1,_M, 0, 3, 0, 0, 0, 0,-2,-6, 0,-4, 4}
};

Page 1 of day.h

/*
 */

#include <stdio.h>
#include <ctype.h>

#define MAXJMP  16      /* max jumps in a diag */
#define MAXGAP  24      /* don't continue to penalize gaps larger than this */
#define JMPS    1024    /* max jmps in an path */
#define MX      4       /* save if there's at least MX-1 bases since last jmp */

#define DMAT    3       /* value of matching bases */
#define DMIS    0       /* penalty for mismatched bases */
#define DINS0   8       /* penalty for a gap */
#define DINS1   1       /* penalty per base */
#define PINS0   8       /* penalty for a gap */
#define PINS1   4       /* penalty per residue */

struct jmp {
    short           n[MAXJMP];  /* size of jmp (neg for dely) */
    unsigned short  x[MAXJMP];  /* base no. of jmp in seq x */
};                              /* limits seq to 2^16 -1 */

struct diag {
    int             score;      /* score at last jmp */
    long            offset;     /* offset of prev block */
    short           ijmp;       /* current jmp index */
    struct jmp      jp;         /* list of jmps */
};


struct path {
    int             spc;        /* number of leading spaces */
    short           n[JMPS];    /* size of jmp (gap) */
    int             x[JMPS];    /* loc of jmp (last elem before gap) */
};

char            *ofile;         /* output file name */
char            *namex[2];      /* seq names: getseqs() */
char            *prog;          /* prog name for err msgs */
char            *seqx[2];       /* seqs: getseqs() */
int             dmax;           /* best diag: nw() */
int             dmax0;          /* final diag */
int             dna;            /* set if dna: main() */
int             endgaps;        /* set if penalizing end gaps */
int             gapx, gapy;     /* total gaps in seqs */
int             len0, len1;     /* seq lens */
int             ngapx, ngapy;   /* total size of gaps */
int             smax;           /* max score: nw() */
int             *xbm;           /* bitmap for matching */
long            offset;         /* current offset in jmp file */
struct diag     *dx;            /* holds diagonals */
struct path     pp[2];          /* holds path for seqs */

char            *calloc(), *malloc(), *index(), *strcpy();
char            *getseq(), *g_calloc();

/* Needleman-Wunsch alignment program

 *
 * usage: progs file1 file2
 *  where file1 and file2 are two dna or two protein sequences.
 *  The sequences can be in upper- or lower-case and may contain ambiguity
 *  Any lines beginning with ';', '>' or '<' are ignored
 *  Max file length is 65535 (limited by unsigned short x in the jmp struct)
 *  A sequence with 1/3 or more of its elements ACGTU is assumed to be DNA
 *  Output is in the file "align.out"
 *
 * The program may create a tmp file in /tmp to hold info about traceback.
 * Original version developed under BSD 4.3 on a vax 8650
 */

#include "nw.h"
#include "day.h"

static _dbval[26] = {
    1,14,2,13,0,0,4,11,0,0,12,0,3,15,0,0,0,5,6,8,8,7,9,0,10,0
};

static _pbval[26] = {
    1, 2|(1<<('D'-'A'))|(1<<('N'-'A')), 4, 8, 16, 32, 64,
    128, 256, 0xFFFFFFF, 1<<10, 1<<11, 1<<12, 1<<13, 1<<14,
    1<<15, 1<<16, 1<<17, 1<<18, 1<<19, 1<<20, 1<<21, 1<<22,
    1<<23, 1<<24, 1<<25|(1<<('E'-'A'))|(1<<('Q'-'A'))
};

main(ac, av)
    int     ac;
    char    *av[];
{
    prog = av[0];
    if (ac != 3) {
        fprintf(stderr,"usage: %s file1 file2\n", prog);
        fprintf(stderr,"where file1 and file2 are two dna or two protein sequences.\n");
        fprintf(stderr,"The sequences can be in upper- or lower-case\n");
        fprintf(stderr,"Any lines beginning with ';' or '<' are ignored\n");
        fprintf(stderr,"Output is in the file \"align.out\"\n");
        exit(1);
    }
    namex[0] = av[1];
    namex[1] = av[2];
    seqx[0] = getseq(namex[0], &len0);
    seqx[1] = getseq(namex[1], &len1);
    xbm = (dna)? _dbval : _pbval;

    endgaps = 0;                /* 1 to penalize endgaps */
    ofile = "align.out";        /* output file */

    nw();               /* fill in the matrix, get the possible jmps */
    readjmps();         /* get the actual jmps */
    print();            /* print stats, alignment */

    cleanup(0);         /* unlink any tmp files */
}

Page 1 of nw.c

/* do the alignment, return best score: main()

 * dna: values in Fitch and Smith, PNAS, 80, 1382-1386, 1983
 * pro: PAM 250 values
 * When scores are equal, we prefer mismatches to any gap, prefer
 * a new gap to extending an ongoing gap, and prefer a gap in seqx
 * to a gap in seq y.
 */
nw()
{
    char        *px, *py;       /* seqs and ptrs */
    int         *ndely, *dely;  /* keep track of dely */
    int         ndelx, delx;    /* keep track of delx */
    int         *tmp;           /* for swapping row0, row1 */
    int         mis;            /* score for each type */
    int         ins0, ins1;     /* insertion penalties */
    register    id;             /* diagonal index */
    register    ij;             /* jmp index */
    register    *col0, *col1;   /* score for curr, last row */
    register    xx, yy;         /* index into seqs */

    dx = (struct diag *)g_calloc("to get diags", len0+len1+1, sizeof(struct diag));

    ndely = (int *)g_calloc("to get ndely", len1+1, sizeof(int));
    dely = (int *)g_calloc("to get dely", len1+1, sizeof(int));
    col0 = (int *)g_calloc("to get col0", len1+1, sizeof(int));
    col1 = (int *)g_calloc("to get col1", len1+1, sizeof(int));
    ins0 = (dna)? DINS0 : PINS0;
    ins1 = (dna)? DINS1 : PINS1;

    smax = -10000;
    if (endgaps) {
        for (col0[0] = dely[0] = -ins0, yy = 1; yy <= len1; yy++) {
            col0[yy] = dely[yy] = col0[yy-1] - ins1;
            ndely[yy] = yy;
        }
        col0[0] = 0;    /* Waterman Bull Math Biol 84 */
    }
    else
        for (yy = 1; yy <= len1; yy++)
            dely[yy] = -ins0;

    /* fill in match matrix
     */
    for (px = seqx[0], xx = 1; xx <= len0; px++, xx++) {
        /* initialize first entry in col
         */
        if (endgaps) {
            if (xx == 1)
                col1[0] = delx = -(ins0+ins1);
            else
                col1[0] = delx = col0[0] - ins1;
            ndelx = xx;
        }
        else {
            col1[0] = 0;
            delx = -ins0;
            ndelx = 0;
        }

        for (py = seqx[1], yy = 1; yy <= len1; py++, yy++) {
            mis = col0[yy-1];
            if (dna)
                mis += (xbm[*px-'A']&xbm[*py-'A'])? DMAT : DMIS;
            else
                mis += _day[*px-'A'][*py-'A'];

            /* update penalty for del in x seq;
             * favor new del over ongoing del
             * ignore MAXGAP if weighting endgaps
             */
            if (endgaps || ndely[yy] < MAXGAP) {
                if (col0[yy] - ins0 >= dely[yy]) {
                    dely[yy] = col0[yy] - (ins0+ins1);
                    ndely[yy] = 1;
                } else {
                    dely[yy] -= ins1;
                    ndely[yy]++;
                }
            } else {
                if (col0[yy] - (ins0+ins1) >= dely[yy]) {
                    dely[yy] = col0[yy] - (ins0+ins1);
                    ndely[yy] = 1;
                } else
                    ndely[yy]++;
            }

            /* update penalty for del in y seq;
             * favor new del over ongoing del
             */
            if (endgaps || ndelx < MAXGAP) {
                if (col1[yy-1] - ins0 >= delx) {
                    delx = col1[yy-1] - (ins0+ins1);
                    ndelx = 1;
                } else {
                    delx -= ins1;
                    ndelx++;
                }
            } else {
                if (col1[yy-1] - (ins0+ins1) >= delx) {
                    delx = col1[yy-1] - (ins0+ins1);
                    ndelx = 1;
                } else
                    ndelx++;
            }

            /* pick the maximum score; we're favoring
             * mis over any del and delx over dely
             */
            id = xx - yy + len1 - 1;
            if (mis >= delx && mis >= dely[yy])
                col1[yy] = mis;
            else if (delx >= dely[yy]) {
                col1[yy] = delx;
                ij = dx[id].ijmp;
                if (dx[id].jp.n[0] && (!dna || (ndelx >= MAXJMP
                && xx > dx[id].jp.x[ij]+MX) || mis > dx[id].score+DINS0)) {
                    dx[id].ijmp++;
                    if (++ij >= MAXJMP) {
                        writejmps(id);
                        ij = dx[id].ijmp = 0;
                        dx[id].offset = offset;
                        offset += sizeof(struct jmp) + sizeof(offset);
                    }
                }
                dx[id].jp.n[ij] = ndelx;
                dx[id].jp.x[ij] = xx;
                dx[id].score = delx;
            }
            else {
                col1[yy] = dely[yy];
                ij = dx[id].ijmp;
                if (dx[id].jp.n[0] && (!dna || (ndely[yy] >= MAXJMP
                && xx > dx[id].jp.x[ij]+MX) || mis > dx[id].score+DINS0)) {
                    dx[id].ijmp++;
                    if (++ij >= MAXJMP) {
                        writejmps(id);
                        ij = dx[id].ijmp = 0;
                        dx[id].offset = offset;
                        offset += sizeof(struct jmp) + sizeof(offset);
                    }
                }
                dx[id].jp.n[ij] = -ndely[yy];
                dx[id].jp.x[ij] = xx;
                dx[id].score = dely[yy];
            }
            if (xx == len0 && yy < len1) {
                /* last col
                 */
                if (endgaps)
                    col1[yy] -= ins0+ins1*(len1-yy);
                if (col1[yy] > smax) {
                    smax = col1[yy];
                    dmax = id;
                }
            }
        }
        if (endgaps && xx < len0)
            col1[yy-1] -= ins0+ins1*(len0-xx);
        if (col1[yy-1] > smax) {
            smax = col1[yy-1];
            dmax = id;
        }
        tmp = col0; col0 = col1; col1 = tmp;
    }
    (void) free((char *)ndely);
    (void) free((char *)dely);
    (void) free((char *)col0);
    (void) free((char *)col1);
}

/*

 * print() -- only routine visible outside this module
 *
 * static:
 * getmat() -- trace back best path, count matches: print()
 * pr_align() -- print alignment of described in array p[]: print()
 * dumpblock() -- dump a block of lines with numbers, stars: pr_align()
 * nums() -- put out a number line: dumpblock()
 * putline() -- put out a line (name, [num], seq, [num]): dumpblock()
 * stars() -- put a line of stars: dumpblock()
 * stripname() -- strip any path and prefix from a seqname
 */

#include "nw.h"

#define SPC     3
#define P_LINE  256     /* maximum output line */
#define P_SPC   3       /* space between name or num and seq */

extern  _day[26][26];
int     olen;           /* set output line length */
FILE    *fx;            /* output file */

print()
{
    int     lx, ly, firstgap, lastgap;      /* overlap */

    if ((fx = fopen(ofile, "w")) == 0) {
        fprintf(stderr,"%s: can't write %s\n", prog, ofile);
        cleanup(1);
    }
    fprintf(fx, "<first sequence: %s (length = %d)\n", namex[0], len0);
    fprintf(fx, "<second sequence: %s (length = %d)\n", namex[1], len1);
    olen = 60;
    lx = len0;
    ly = len1;
    firstgap = lastgap = 0;
    if (dmax < len1 - 1) {          /* leading gap in x */
        pp[0].spc = firstgap = len1 - dmax - 1;
        ly -= pp[0].spc;
    }
    else if (dmax > len1 - 1) {     /* leading gap in y */
        pp[1].spc = firstgap = dmax - (len1 - 1);
        lx -= pp[1].spc;
    }
    if (dmax0 < len0 - 1) {         /* trailing gap in x */
        lastgap = len0 - dmax0 - 1;
        lx -= lastgap;
    }
    else if (dmax0 > len0 - 1) {    /* trailing gap in y */
        lastgap = dmax0 - (len0 - 1);
        ly -= lastgap;
    }
    getmat(lx, ly, firstgap, lastgap);
    pr_align();
}

/*

 * trace back the best path, count matches
 */
static
getmat(lx, ly, firstgap, lastgap)
    int     lx, ly;                 /* "core" (minus endgaps) */
    int     firstgap, lastgap;      /* leading trailing overlap */
{
    int             nm, i0, i1, siz0, siz1;
    char            outx[32];
    double          pct;
    register        n0, n1;
    register char   *p0, *p1;

    /* get total matches, score
     */
    i0 = i1 = siz0 = siz1 = 0;
    p0 = seqx[0] + pp[1].spc;
    p1 = seqx[1] + pp[0].spc;
    n0 = pp[1].spc + 1;
    n1 = pp[0].spc + 1;

    nm = 0;
    while ( *p0 && *p1 ) {
        if (siz0) {
            p1++;
            n1++;
            siz0--;
        }
        else if (siz1) {
            p0++;
            n0++;
            siz1--;
        }
        else {
            if (xbm[*p0-'A']&xbm[*p1-'A'])
                nm++;
            if (n0++ == pp[0].x[i0])
                siz0 = pp[0].n[i0++];
            if (n1++ == pp[1].x[i1])
                siz1 = pp[1].n[i1++];
            p0++;
            p1++;
        }
    }

    /* pct homology:
     * if penalizing endgaps, base is the shorter seq
     * else, knock off overhangs and take shorter core
     */
    if (endgaps)
        lx = (len0 < len1)? len0 : len1;
    else
        lx = (lx < ly)? lx : ly;
    pct = 100.*(double)nm/(double)lx;
    fprintf(fx, "\n");
    fprintf(fx, "<%d match%s in an overlap of %d: %.2f percent similarity\n",
        nm, (nm == 1)? "" : "es", lx, pct);



Page 2 of nwprint.c 






    fprintf(fx, "<gaps in first sequence: %d", gapx);
    if (gapx) {
        (void) sprintf(outx, " (%d %s%s)",
            ngapx, (dna)? "base":"residue", (ngapx == 1)? "":"s");
        fprintf(fx,"%s", outx);
    }
    fprintf(fx, ", gaps in second sequence: %d", gapy);
    if (gapy) {
        (void) sprintf(outx, " (%d %s%s)",
            ngapy, (dna)? "base":"residue", (ngapy == 1)? "":"s");
        fprintf(fx,"%s", outx);
    }
    if (dna)
        fprintf(fx,
        "\n<score: %d (match = %d, mismatch = %d, gap penalty = %d + %d per base)\n",
        smax, DMAT, DMIS, DINS0, DINS1);
    else
        fprintf(fx,
        "\n<score: %d (Dayhoff PAM 250 matrix, gap penalty = %d + %d per residue)\n",
        smax, PINS0, PINS1);
    if (endgaps)
        fprintf(fx,
        "<endgaps penalized. left endgap: %d %s%s, right endgap: %d %s%s\n",
        firstgap, (dna)? "base" : "residue", (firstgap == 1)? "" : "s",
        lastgap, (dna)? "base" : "residue", (lastgap == 1)? "" : "s");
    else
        fprintf(fx, "<endgaps not penalized\n");
}

static          nm;             /* matches in core -- for checking */
static          lmax;           /* lengths of stripped file names */
static          ij[2];          /* jmp index for a path */
static          nc[2];          /* number at start of current line */
static          ni[2];          /* current elem number -- for gapping */
static          siz[2];
static char     *ps[2];         /* ptr to current element */
static char     *po[2];         /* ptr to next output char slot */
static char     out[2][P_LINE]; /* output line */
static char     star[P_LINE];   /* set by stars() */



/*
 * print alignment of described in struct path pp[]
 */
static
pr_align()
{
    int         nn;     /* char count */
    int         more;
    register    i;

    for (i = 0, lmax = 0; i < 2; i++) {
        nn = stripname(namex[i]);
        if (nn > lmax)
            lmax = nn;

        nc[i] = 1;
        ni[i] = 1;
        siz[i] = ij[i] = 0;
        ps[i] = seqx[i];
        po[i] = out[i];
    }

    for (nn = nm = 0, more = 1; more; ) {
        for (i = more = 0; i < 2; i++) {
            /*
             * do we have more of this sequence?
             */
            if (!*ps[i])
                continue;

            more++;

            if (pp[i].spc) {        /* leading space */
                *po[i]++ = ' ';
                pp[i].spc--;
            }
            else if (siz[i]) {      /* in a gap */
                *po[i]++ = '-';
                siz[i]--;
            }
            else {                  /* we're putting a seq element
                                     */
                *po[i] = *ps[i];
                if (islower(*ps[i]))
                    *ps[i] = toupper(*ps[i]);
                po[i]++;
                ps[i]++;

                /*
                 * are we at next gap for this seq?
                 */
                if (ni[i] == pp[i].x[ij[i]]) {
                    /*
                     * we need to merge all gaps
                     * at this location
                     */
                    siz[i] = pp[i].n[ij[i]++];
                    while (ni[i] == pp[i].x[ij[i]])
                        siz[i] += pp[i].n[ij[i]++];
                }
                ni[i]++;
            }
        }
        if (++nn == olen || !more && nn) {
            dumpblock();
            for (i = 0; i < 2; i++)
                po[i] = out[i];
            nn = 0;
        }
    }
}


/*
 * dump a block of lines, including numbers, stars: pr_align()
 */
static
dumpblock()
{
    register    i;

    for (i = 0; i < 2; i++)
        *po[i]-- = '\0';

    (void) putc('\n', fx);
    for (i = 0; i < 2; i++) {
        if (*out[i] && (*out[i] != ' ' || *(po[i]) != ' ')) {
            if (i == 0)
                nums(i);
            if (i == 0 && *out[1])
                stars();
            putline(i);
            if (i == 0 && *out[1])
                fprintf(fx, star);
            if (i == 1)
                nums(i);
        }
    }
}


/*
 * put out a number line: dumpblock()
 */
static
nums(ix)
    int     ix;     /* index in out[] holding seq line */
{
    char            nline[P_LINE];
    register        i, j;
    register char   *pn, *px, *py;

    for (pn = nline, i = 0; i < lmax+P_SPC; i++, pn++)
        *pn = ' ';
    for (i = nc[ix], py = out[ix]; *py; py++, pn++) {
        if (*py == ' ' || *py == '-')
            *pn = ' ';
        else {
            if (i%10 == 0 || (i == 1 && nc[ix] != 1)) {
                j = (i < 0)? -i : i;
                for (px = pn; j; j /= 10, px--)
                    *px = j%10 + '0';
                if (i < 0)
                    *px = '-';
            }
            else
                *pn = ' ';
            i++;
        }
    }
    *pn = '\0';
    nc[ix] = i;
    for (pn = nline; *pn; pn++)
        (void) putc(*pn, fx);
    (void) putc('\n', fx);
}

/*
 * put out a line (name, [num], seq, [num]): dumpblock()
 */
static
putline(ix)
    int     ix;
{
    int             i;
    register char   *px;

    for (px = namex[ix], i = 0; *px && *px != ':'; px++, i++)
        (void) putc(*px, fx);
    for (; i < lmax+P_SPC; i++)
        (void) putc(' ', fx);

    /* these count from 1:
     * ni[] is current element (from 1)
     * nc[] is number at start of current line
     */
    for (px = out[ix]; *px; px++)
        (void) putc(*px&0x7F, fx);
    (void) putc('\n', fx);
}


/*
 * put a line of stars (seqs always in out[0], out[1]): dumpblock()
 */
static
stars()
{
    int             i;
    register char   *p0, *p1, cx, *px;

    if (!*out[0] || (*out[0] == ' ' && *(po[0]) == ' ') ||
        !*out[1] || (*out[1] == ' ' && *(po[1]) == ' '))
        return;
    px = star;
    for (i = lmax+P_SPC; i; i--)
        *px++ = ' ';

    for (p0 = out[0], p1 = out[1]; *p0 && *p1; p0++, p1++) {
        if (isalpha(*p0) && isalpha(*p1)) {
            if (xbm[*p0-'A']&xbm[*p1-'A']) {
                cx = '*';
                nm++;
            }
            else if (!dna && _day[*p0-'A'][*p1-'A'] > 0)
                cx = '.';
            else
                cx = ' ';
        }
        else
            cx = ' ';
        *px++ = cx;
    }
    *px++ = '\n';
    *px = '\0';
}

/*

 * strip path or prefix from pn, return len: pr_align()
 */
static
stripname(pn)
    char    *pn;    /* file name (may be path) */
{
    register char   *px, *py;

    py = 0;
    for (px = pn; *px; px++)
        if (*px == '/')
            py = px + 1;
    if (py)
        (void) strcpy(pn, py);
    return(strlen(pn));
}

/*

 * cleanup() -- cleanup any tmp file
 * getseq() -- read in seq, set dna, len, maxlen
 * g_calloc() -- calloc() with error checking
 * readjmps() -- get the good jmps, from tmp file if necessary
 * writejmps() -- write a filled array of jmps to a tmp file: nw()
 */

#include "nw.h"
#include <sys/file.h>

char    *jname = "/tmp/homgXXXXXX";     /* tmp file for jmps */
FILE    *fj;

int     cleanup();                      /* cleanup tmp file */
long    lseek();

/*
 * remove any tmp file if we blow
 */
cleanup(i)
    int     i;
{
    if (fj)
        (void) unlink(jname);
    exit(i);
}

/*
 * read, return ptr to seq, set dna, len, maxlen
 * skip lines starting with ';', '<', or '>'
 * seq in upper or lower case
 */
char *
getseq(file, len)
    char    *file;  /* file name */
    int     *len;   /* seq len */
{
    char            line[1024], *pseq;
    register char   *px, *py;
    int             natgc, tlen;
    FILE            *fp;

    if ((fp = fopen(file,"r")) == 0) {
        fprintf(stderr,"%s: can't read %s\n", prog, file);
        exit(1);
    }
    tlen = natgc = 0;
    while (fgets(line, 1024, fp)) {
        if (*line == ';' || *line == '<' || *line == '>')
            continue;
        for (px = line; *px != '\n'; px++)
            if (isupper(*px) || islower(*px))
                tlen++;
    }
    if ((pseq = malloc((unsigned)(tlen+6))) == 0) {
        fprintf(stderr,"%s: malloc() failed to get %d bytes for %s\n", prog, tlen+6, file);
        exit(1);
    }
    pseq[0] = pseq[1] = pseq[2] = pseq[3] = '\0';

    py = pseq + 4;
    *len = tlen;
    rewind(fp);

    while (fgets(line, 1024, fp)) {
        if (*line == ';' || *line == '<' || *line == '>')
            continue;
        for (px = line; *px != '\n'; px++) {
            if (isupper(*px))
                *py++ = *px;
            else if (islower(*px))
                *py++ = toupper(*px);
            if (index("ATGCU",*(py-1)))
                natgc++;
        }
    }
    *py++ = '\0';
    *py = '\0';
    (void) fclose(fp);
    dna = natgc > (tlen/3);
    return(pseq+4);
}


char *
g_calloc(msg, nx, sz)
    char    *msg;           /* program, calling routine */
    int     nx, sz;         /* number and size of elements */
{
    char    *px, *calloc();

    if ((px = calloc((unsigned)nx, (unsigned)sz)) == 0) {
        if (*msg) {
            fprintf(stderr, "%s: g_calloc() failed %s (n=%d, sz=%d)\n", prog, msg, nx, sz);
            exit(1);
        }
    }
    return(px);
}


/*
 * get final jmps from dx[] or tmp file, set pp[], reset dmax: main()
 */
readjmps()
{
    int         fd = -1;
    int         siz, i0, i1;
    register    i, j, xx;

    if (fj) {
        (void) fclose(fj);
        if ((fd = open(jname, O_RDONLY, 0)) < 0) {
            fprintf(stderr, "%s: can't open() %s\n", prog, jname);
            cleanup(1);
        }
    }
    for (i = i0 = i1 = 0, dmax0 = dmax, xx = len0; ; i++) {
        while (1) {
            for (j = dx[dmax].ijmp; j >= 0 && dx[dmax].jp.x[j] >= xx; j--)
                ;
            if (j < 0 && dx[dmax].offset && fj) {
                (void) lseek(fd, dx[dmax].offset, 0);
                (void) read(fd, (char *)&dx[dmax].jp, sizeof(struct jmp));
                (void) read(fd, (char *)&dx[dmax].offset, sizeof(dx[dmax].offset));
                dx[dmax].ijmp = MAXJMP-1;
            }
            else
                break;
        }
        if (i >= JMPS) {
            fprintf(stderr, "%s: too many gaps in alignment\n", prog);
            cleanup(1);
        }
        if (j >= 0) {
            siz = dx[dmax].jp.n[j];
            xx = dx[dmax].jp.x[j];
            dmax += siz;
            if (siz < 0) {          /* gap in second seq */
                pp[1].n[i1] = -siz;
                xx += siz;
                /* id = xx - yy + len1 - 1
                 */
                pp[1].x[i1] = xx - dmax + len1 - 1;
                gapy++;
                ngapy -= siz;
                /* ignore MAXGAP when doing endgaps */
                siz = (-siz < MAXGAP || endgaps)? -siz : MAXGAP;
                i1++;
            }
            else if (siz > 0) {     /* gap in first seq */
                pp[0].n[i0] = siz;
                pp[0].x[i0] = xx;
                gapx++;
                ngapx += siz;
                /* ignore MAXGAP when doing endgaps */
                siz = (siz < MAXGAP || endgaps)? siz : MAXGAP;
                i0++;
            }
        }
        else
            break;
    }

    /* reverse the order of jmps
     */
    for (j = 0, i0--; j < i0; j++, i0--) {
        i = pp[0].n[j]; pp[0].n[j] = pp[0].n[i0]; pp[0].n[i0] = i;
        i = pp[0].x[j]; pp[0].x[j] = pp[0].x[i0]; pp[0].x[i0] = i;
    }
    for (j = 0, i1--; j < i1; j++, i1--) {
        i = pp[1].n[j]; pp[1].n[j] = pp[1].n[i1]; pp[1].n[i1] = i;
        i = pp[1].x[j]; pp[1].x[j] = pp[1].x[i1]; pp[1].x[i1] = i;
    }
    if (fd >= 0)
        (void) close(fd);
    if (fj) {
        (void) unlink(jname);
        fj = 0;
        offset = 0;
    }
}

/*

 * write a filled jmp struct offset of the prev one (if any): nw()
 */
writejmps(ix)
    int     ix;
{
    char    *mktemp();

    if (!fj) {
        if (mktemp(jname) < 0) {
            fprintf(stderr, "%s: can't mktemp() %s\n", prog, jname);
            cleanup(1);
        }
        if ((fj = fopen(jname, "w")) == 0) {
            fprintf(stderr, "%s: can't write %s\n", prog, jname);
            exit(1);
        }
    }
    (void) fwrite((char *)&dx[ix].jp, sizeof(struct jmp), 1, fj);
    (void) fwrite((char *)&dx[ix].offset, sizeof(dx[ix].offset), 1, fj);
}
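
The gap scoring applied by nw() in the listing above can be summarized with a small helper, given here only as a reading aid. The function name `gap_cost` and its parameters are hypothetical and do not appear in Table 5. The sketch assumes, following the update rules for dely and delx in nw(), that opening a gap costs the fixed penalty plus one per-element penalty, that each further element adds the per-element penalty, and that, when end gaps are not being penalized, elements beyond MAXGAP add nothing further.

```c
#define MAXGAP 24   /* as in nw.h: don't continue to penalize gaps larger than this */

/* Total penalty charged for one gap of n elements (hypothetical helper).
 * ins0 is the fixed gap penalty (DINS0 or PINS0 in nw.h), ins1 the
 * per-element penalty (DINS1 or PINS1), and endgaps mirrors the global
 * flag of the same name: when it is nonzero the MAXGAP cap is ignored. */
int gap_cost(int n, int ins0, int ins1, int endgaps)
{
    int penalized;

    if (n <= 0)
        return 0;                       /* no gap, no penalty */
    penalized = (endgaps || n < MAXGAP) ? n : MAXGAP;
    return ins0 + ins1 * penalized;
}
```

Under these assumptions, with the DNA settings (DINS0 = 8, DINS1 = 1) a gap of three bases would cost 8 + 3 = 11, and with the protein settings (PINS0 = 8, PINS1 = 4) a gap of three residues would cost 8 + 12 = 20.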
Example calculations for determining % amino acid sequence identity and nucleic acid
sequence identity:

1.

PRO                  XXXXXXXXXXXXXXX    (Length = 15 amino acids)
Comparison Protein   XXXXXYYYYYYY       (Length = 12 amino acids)

% amino acid sequence identity =

(the number of identically matching amino acid residues between the two polypeptide
sequences as determined by ALIGN-2) divided by (the total number of amino acid residues of
the PRO polypeptide) =

5 divided by 15 = 33.3%

2.

PRO                  XXXXXXXXXX         (Length = 10 amino acids)
Comparison Protein   XXXXXYYYYYYZZYZ    (Length = 15 amino acids)

% amino acid sequence identity =

(the number of identically matching amino acid residues between the two polypeptide
sequences as determined by ALIGN-2) divided by (the total number of amino acid residues of
the PRO polypeptide) =

5 divided by 10 = 50%

3.

PRO-DNA          NNNNNNNNNNNNNN      (Length = 14 nucleotides)
Comparison DNA   NNNNNNLLLLLLLLLL    (Length = 16 nucleotides)

% nucleic acid sequence identity =

(the number of identically matching nucleotides between the two nucleic acid sequences as
determined by ALIGN-2) divided by (the total number of nucleotides of the PRO-DNA
nucleic acid sequence) =

6 divided by 14 = 42.9%

4.

PRO-DNA          NNNNNNNNNNNN    (Length = 12 nucleotides)
Comparison DNA   NNNNLLLVV       (Length = 9 nucleotides)

% nucleic acid sequence identity =

(the number of identically matching nucleotides between the two nucleic acid sequences as
determined by ALIGN-2) divided by (the total number of nucleotides of the PRO-DNA
nucleic acid sequence) =

4 divided by 12 = 33.3%
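
The four calculations above share one formula: the number of matches divided by the length of the PRO (or PRO-DNA) sequence, never the length of the comparison sequence. A minimal sketch in C follows, assuming the match count has already been obtained from the alignment step. The helper name `percent_identity` is hypothetical and is not part of the ALIGN-2 listing.

```c
#include <assert.h>
#include <stdio.h>

/* Percent identity = 100 * matches / length of the reference (PRO) sequence.
 * `matches` is assumed to come from a prior alignment step. */
double percent_identity(int matches, int ref_len)
{
    assert(ref_len > 0);
    assert(matches >= 0 && matches <= ref_len);
    return 100.0 * (double)matches / (double)ref_len;
}

/* Print the four worked examples from the text, one value per line. */
void print_identity_examples(void)
{
    printf("%.1f%%\n", percent_identity(5, 15));  /* example 1 */
    printf("%.1f%%\n", percent_identity(5, 10));  /* example 2 */
    printf("%.1f%%\n", percent_identity(6, 14));  /* example 3 */
    printf("%.1f%%\n", percent_identity(4, 12));  /* example 4 */
}
```

Note that in example 2 the comparison protein (15 residues) is longer than the PRO polypeptide (10 residues); the denominator is still 10, which is why the helper takes only the reference length.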

Although the foregoing refers to particular embodiments, it will be understood that the
present invention is not so limited. It will occur to those of ordinary skill in the art that
various modifications may be made to the disclosed embodiments without diverting from the
overall concept of the invention. All such modifications are intended to be within the scope of
the present invention.

What is claimed is: 
CLAIMS

1. A method of detecting high-grade dysplasia (HGD) in cells of a tissue sample, the
method comprising:

(a) obtaining a test tissue sample suspected of comprising cells exhibiting HGD;

(b) establishing the level of expression in the test tissue sample of at least eight genes
selected from the group consisting of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1);
AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8
(NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773)
(SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NROB2
(Nuclear hormone receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID
NO:13); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B
(methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2
(stanniocalcin-2, NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal
precursor, NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1,
NM_004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ
ID NO:25); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2
(proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE
(insulin-degrading enzyme, NM_004969) (SEQ ID NO:31); MYOIA (myosin-1A, NM_005379)
(SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35);
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon
and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID
NO:43), or variants thereof having at least 80% nucleic acid sequence identity, wherein the
tissue is from esophagus or colon; and

(c) comparing expression of the at least eight genes to a baseline expression of the
genes in normal tissue controls of the same tissue type, wherein an increase of at least 1.5-fold
in expression of the genes relative to the baseline expression indicates that cells of the test
sample exhibit HGD.

2. The method of claim 1, wherein the tissue is human tissue.



3. A method of identifying an esophageal tissue susceptible to esophageal adenocarcinoma,
comprising detecting esophageal HGD in a test tissue sample according to claim 1.

4. A method according to claim 1, wherein an increase of at least 2-fold in expression of genes
relative to the baseline is observed.

5. A method according to claim 1, wherein at least one of the at least eight genes is selected
from the group consisting of AGR2 (SEQ ID NO:3), TM7SF1 (SEQ ID NO:13), MAT2B
(SEQ ID NO:17), SLNAC1 (SEQ ID NO:23), and TCF4 (SEQ ID NO:43), or variants thereof
having at least 80% nucleic acid sequence identity.

6. A method for determining predisposition of a mammalian tissue to a neoplastic
transformation by detecting HGD in cells of the tissue, the method comprising determining in
a cell from the tissue expression of a nucleic acid sequence of at least eight genes selected
from the group consisting of ET-1 (endothelin-1, NM_001955) (SEQ ID NO:1);
AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8
(NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773)
(SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NROB2
(Nuclear hormone receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID
NO:13); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B
(methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2
(stanniocalcin-2, NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal
precursor, NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1,
NM_004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ
ID NO:25); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2
(proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE
(insulin-degrading enzyme, NM_004969) (SEQ ID NO:31); MYOIA (myosin-1A, NM_005379)
(SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35);
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon
and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID
NO:43), or variants thereof having at least 80% nucleic acid sequence identity, wherein the
tissue is from esophagus or colon, and wherein the expression in the test sample is at least
1.5-fold above baseline expression in a normal tissue control of the same tissue type.

7. A method according to claim 6, wherein the tissue is human tissue. 

8. A method according to claim 6, wherein at least one of the at least eight genes is selected
from the group consisting of AGR2 (SEQ ID NO:3), TM7SF1 (SEQ ID NO:13), MAT2B
(SEQ ID NO:17), SLNAC1 (SEQ ID NO:23), and TCF4 (SEQ ID NO:43), or variants thereof
having at least 80% nucleic acid sequence identity.



9. A method of detecting high-grade dysplasia (HGD) in cells of a mammalian tissue sample,
the method comprising:

(a) obtaining a test tissue sample suspected of comprising cells exhibiting HGD;

(b) establishing the level of expression in the test tissue sample of at least eight
polypeptides encoded by genes selected from the group consisting of ET-1 (endothelin-1,
NM_001955) (SEQ ID NO:1);
AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8
(NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773)
(SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NROB2
(Nuclear hormone receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID
NO:13); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B
(methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2
(stanniocalcin-2, NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal
precursor, NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1,
NM_004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ
ID NO:25); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2
(proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE
(insulin-degrading enzyme, NM_004969) (SEQ ID NO:31); MYOIA (myosin-1A, NM_005379)
(SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35);
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon
and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID
NO:43), or variants thereof having at least 80% nucleic acid sequence identity, wherein the
tissue is from esophagus or colon; and

(c) comparing expression of the at least eight polypeptides in the test tissue sample to
expression of the at least eight polypeptides in normal tissue controls of the same tissue type,
wherein an increase of at least 1.5-fold in expression of the polypeptides in the test tissue
sample relative to the normal tissue controls indicates that cells of the test sample exhibit
HGD.

10. A method according to claim 9, comprising contacting the test tissue sample with an
antibody that specifically binds one of the at least eight polypeptides under conditions that
permit the antibody to bind the polypeptide.

11. A method according to claim 9, wherein at least one of the at least eight polypeptides
is expressed by a gene selected from the group consisting of AGR2 (SEQ ID NO:3), TM7SF1
(SEQ ID NO:13), MAT2B (SEQ ID NO:17), SLNAC1 (SEQ ID NO:23), and TCF4 (SEQ ID
NO:43), or variants thereof having at least 80% nucleic acid sequence identity.

12. The method of claim 1, wherein gene expression is determined by nucleic acid microarray
analysis.

13. The method of claim 12, wherein analysis comprises contacting nucleic acid from a test
tissue sample with a nucleic acid microarray comprising nucleic acid probe sequences,
wherein at least eight of the nucleic acid probe sequences separately comprise at least 50
contiguous nucleotides from a gene selected from the group consisting of ET-1 (endothelin-1,
NM_001955) (SEQ ID NO:1);
AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8
(NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773)
(SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NROB2
(Nuclear hormone receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID
NO:13); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B
(methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2
(stanniocalcin-2, NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal
precursor, NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1,
NM_004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ
ID NO:25); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2
(proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE
(insulin-degrading enzyme, NM_004969) (SEQ ID NO:31); MYOIA (myosin-1A, NM_005379)
(SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35);
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon
and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID
NO:43), or variants thereof having at least 80% nucleic acid sequence identity.

14. The method of claim 13, wherein the at least eight nucleic acid probe sequences comprise
at least 60 contiguous nucleotides from a gene selected from the group.

15. The method of claim 14, wherein the at least eight nucleic acid probe sequences comprise
at least 80 contiguous nucleotides from a gene selected from the group.

16. The method of claim 15, wherein the at least eight nucleic acid probe sequences comprise
at least 100 contiguous nucleotides from a gene selected from the group.

17. The method of claim 16, wherein the at least eight nucleic acid probe sequences comprise
at least 150 contiguous nucleotides from a gene selected from the group.

18. The method of claim 17, wherein the at least eight nucleic acid probe sequences comprise
at least 200 contiguous nucleotides from a gene selected from the group.

19. The method of claim 13, wherein the nucleic acid microarray comprises nucleic acid
probe sequences from at least ten genes selected from the group.

20. The method of claim 19, wherein the nucleic acid microarray comprises nucleic acid
probe sequences from at least twelve genes selected from the group.

21. The method of claim 20, wherein the nucleic acid microarray comprises nucleic acid
probe sequences from at least fifteen genes selected from the group.

22. The method of claim 21, wherein the nucleic acid microarray comprises nucleic acid
probe sequences from at least eighteen genes selected from the group.

23. The method of claim 22, wherein the nucleic acid microarray comprises nucleic acid
probe sequences from at least twenty genes selected from the group.




WO 2004/044178 



PCT/US2003/036260 



24. The method of claim 23, wherein the nucleic acid microarray comprises nucleic acid 
probe sequences from at least twenty two genes selected from the group. 

25. The method of claim 1, wherein gene expression is determined by nucleic acid
hybridization under high stringency conditions of a detectable probe comprising at least 50
contiguous nucleotides from a gene selected from the group to nucleic acid of cells of the test
tissue sample relative to cells of the normal tissue control.

26. The method of claim 25, wherein the hybridization is in situ hybridization.

27. The method of claim 26, wherein the hybridization is fluorescent in situ hybridization.

28. The method of claim 1, wherein gene expression is determined by polymerase chain
reaction (PCR) analysis.

29. The method of claim 1, wherein gene expression is determined by real-time polymerase
chain reaction (RT-PCR) analysis.

30. The method of claim 1, wherein gene expression is determined by Taqman® polymerase
chain reaction analysis.

31. A kit comprising a microarray, the microarray comprising nucleic acid probe sequences,
wherein at least eight of the nucleic acid probe sequences each comprise at least 50 contiguous
nucleotides from a gene selected from the group consisting of ET-1 (endothelin-1,
NM_001955) (SEQ ID NO:1);
AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408) (SEQ ID NO:3); ADAM8
(NM_001109) (SEQ ID NO:5); PRSS8 (Prostasin precursor, serine protease, NM_002773)
(SEQ ID NO:7); AXO1 (Axonin-1 precursor, NM_005076) (SEQ ID NO:9); NROB2
(Nuclear hormone receptor, NM_021969) (SEQ ID NO:11); TM7SF1 (NM_003272) (SEQ ID
NO:13); DLDH (dihydrolipoamide dehydrogenase, NM_000108) (SEQ ID NO:15); MAT2B
(methionine adenosyltransferase II, beta, NM_013283) (SEQ ID NO:17); STC-2
(stanniocalcin-2, NM_003714) (SEQ ID NO:19); PPBI (alkaline phosphatase, intestinal
precursor, NM_001631) (SEQ ID NO:21); SLNAC1 (sodium channel receptor SLNAC1,
NM_004769) (SEQ ID NO:23); CAH4 (carbonic anhydrase iv precursor, NM_000717) (SEQ
ID NO:25); PA21 (phospholipase a2 precursor, NM_000928) (SEQ ID NO:27); PAR2
(proteinase activated receptor 2 precursor, NM_005242) (SEQ ID NO:29); IDE
(insulin-degrading enzyme, NM_004969) (SEQ ID NO:31); MYOIA (myosin-1A, NM_005379)
(SEQ ID NO:33); CYP2J2 (cytochrome P450 monooxygenase, NM_000775) (SEQ ID NO:35);
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) (SEQ ID NO:37); CYB5
(cytochrome b5, 3' end, NM_001914) (SEQ ID NO:39); COXVIb (coxVIb gene, last exon
and flanking sequence, NM_001863) (SEQ ID NO:41); and TCF4 (NM_030756) (SEQ ID
NO:43), or variants thereof having at least 80% nucleic acid sequence identity, and a package
insert indicating that the microarray is for use in detecting HGD in a test tissue sample,
wherein the tissue is from esophagus or colon, and wherein an increase in expression in the
test tissue sample of at least 1.5-fold of the at least eight genes relative to a normal tissue
control of the same tissue type indicates that cells of the test tissue exhibit HGD.

32. The kit of claim 31, wherein the nucleic acid probe sequences each comprise at least 60
contiguous nucleotides from a gene selected from the group.

33. The kit of claim 32, wherein the nucleic acid probe sequences each comprise at least 80
contiguous nucleotides from a gene selected from the group.

34. The kit of claim 33, wherein the nucleic acid probe sequences each comprise at least 100
contiguous nucleotides from a gene selected from the group.

35. The kit of claim 34, wherein the nucleic acid probe sequences each comprise at least 150
contiguous nucleotides from a gene selected from the group.

36. The kit of claim 35, wherein the nucleic acid probe sequences each comprise at least 200
contiguous nucleotides from a gene selected from the group.

37. A method of detecting cancer in a patient, the method comprising:

(a) obtaining a test tissue sample from the patient;

(b) establishing the level of expression of a gene selected from the group consisting of
CAD17 (liver-intestine cadherin, NM_004063) (SEQ ID NO:45), CLDN15 (claudin 15,
NM_014343) (SEQ ID NO:47), SLNAC1 (sodium channel, NM_004769) (SEQ ID NO:23),
CFTR (chloride channel, NM_000492) (SEQ ID NO:49), H2R (histamine H2 receptor,
NM_022304) (SEQ ID NO:51), PRSS8 (serine protease, NM_002773) (SEQ ID NO:7), PA21
(phospholipase A2 group IB, NM_000928) (SEQ ID NO:27), AGR2 (anterior gradient 2
homolog, NM_006408) (SEQ ID NO:3), EGFR (NM_005228) (SEQ ID NO:53), EPHB2
(NM_004442) (SEQ ID NO:55), CRIPTO CR-1 (NM_003212) (SEQ ID NO:57), Ephrin B1
(NM_004429) (SEQ ID NO:59), MMP-17/MT4-MMP (NM_016155) (SEQ ID NO:61),
MMP26 (NM_021801) (SEQ ID NO:63), ADAM10 (NM_001110) (SEQ ID NO:65),
ADAM8 (NM_001109) (SEQ ID NO:5), ADAM1 (XM_132370) (SEQ ID NO:67), TIM1
(NM_003254) (SEQ ID NO:69), MUC1 (XM_053256) (SEQ ID NO:71), CEA (NM_004363)
(SEQ ID NO:73), NCA (NM_002483) (SEQ ID NO:75), Follistatin (NM_006350) (SEQ ID
NO:77), Claudin 1 (NM_021101) (SEQ ID NO:79), Claudin 14 (NM_012130) (SEQ ID
NO:81), tenascin-R (NM_003285) (SEQ ID NO:83), CAD3 (NM_001793) (SEQ ID NO:85),
AXO1 (NM_005076) (SEQ ID NO:9), CONT (NM_001843) (SEQ ID NO:87), Osteopontin
(NM_000582) (SEQ ID NO:89), Galectin 8 (NM_006499) (SEQ ID NO:91), PGS1 (biglycan,
NM_001711) (SEQ ID NO:93), Frizzled 2 (NM_001466) (SEQ ID NO:95), ISLR
(NM_005545) (SEQ ID NO:97), FLJ23399 (NM_022763) (SEQ ID NO:99), TEM1
(NM_020404) (SEQ ID NO:101), Tie2 ligand2 (NM_001147) (SEQ ID NO:103), STC-2
(NM_003714) (SEQ ID NO:19), VEGFC (NM_005429) (SEQ ID NO:105), tPA
(NM_000930) (SEQ ID NO:107), Endothelin 1 (NM_001955) (SEQ ID NO:1),
Thrombomodulin (NM_000361) (SEQ ID NO:109), TF (NM_001993) (SEQ ID NO:111),
GPR4 (NM_005282) (SEQ ID NO:113), GPR66 (NM_006056) (SEQ ID NO:115), SLC22A2
(NM_003058) (SEQ ID NO:117), MLSN1 (NM_002420) (SEQ ID NO:119), and ATN2
(Na/K transport, NM_000702) (SEQ ID NO:121), or variants thereof having at least 80%
nucleic acid sequence identity, wherein the test tissue is from esophagus or colon; and wherein
the expression in the test tissue is at a level at least 1.5-fold above expression of the gene in a
normal tissue control of the same tissue type.

38. The method of claim 37, wherein inhibition of cell growth is cell death. 

39. The method of claim 37, wherein at least two genes selected from the group are expressed 
at a level at least 1.5-fold above expression of the gene in a normal cell control. 

40. The method of claim 39, wherein at least three genes selected from the group are 
expressed at a level at least 1.5-fold above expression of the gene in a normal cell control. 

41. The method of claim 40, wherein at least 5 genes selected from the group are expressed at
a level at least 1.5-fold above expression of the gene in a normal cell control.

42. The method of claim 41, wherein at least 8 genes selected from the group are expressed at
a level at least 1.5-fold above expression of the gene in a normal cell control.

43. The method of claim 1, wherein the expression p value is less than 0.07.

44. The method of claim 6, wherein the expression p value is less than 0.07.

45. The method of claim 9, wherein the expression p value is less than 0.07. 

Figure 1A
Figure 2A
Figure 2B
Figure 3A
Figure 3B

ET-1 (endothelin-1, NM_001955)

1 cgccgcgtgc gcctgcagac gctccgctcg ctgccttctc tcctggcagg cgctgccttt 

61 tctccccgtt aaagggcact tgggctgaag gatcgctttg agatctgagg aacccgcagc 

121 gctttgaggg acctgaagct gtttttcttc gttttccttt gggttcagtt tgaacgggag 

181 gtttttgatc cctttttttc agaatggatt atttgctcat gattttctct ctgctgtttg 

241 tggcttgcca aggagctcca gaaacagcag tcttaggcgc tgagctcagc gcggtgggtg 

301 agaacggcgg ggagaaaccc actcccagtc caccctggcg gctccgccgg tccaagcgct
361 gctcctgctc gtccctgatg gataaagagt gtgtctactt ctgccacctg gacatcattt 
421 gggtcaacac tcccgagcac gttgttccgt atggacttgg aagccctagg tccaagagag 

481 ccttggagaa tttacttccc acaaaggcaa cagaccgtga gaatagatgc caatgtgcta
541 gccaaaaaga caagaagtgc tggaattttt gccaagcagg aaaagaactc agggctgaag 
601 acattatgga gaaagactgg aataatcata agaaaggaaa agactgttcc aagcttggga
661 aaaagtgtat ttatcagcag ttagtgagag gaagaaaaat cagaagaagt tcagaggaac
721 acctaagaca aaccaggtcg gagaccatga gaaacagcgt caaatcatct tttcatgatc 
781 ccaagctgaa aggcaatccc tccagagagc gttatgtgac ccacaaccga gcacattggt 
841 gacagacctt cggggcctgt ctgaagccat agcctccacg gagagccctg tggccgactc 
901 tgcactctcc accctggctg ggatcagagc aggagcatcc tctgctggtt cctgactggc 
961 aaaggaccag cgtcctcgtt caaaacattc caagaaaggt taaggagttc ccccaaccat 

1021 cttcactggc ttccatcagt ggtaactgct ttggtctctt ctttcatctg gggatgacaa 
1081 tggacctctc agcagaaaca cacagtcaca ttcgaattcg ggtggcatcc tccggagaga 
1141 gagagaggaa ggagattcca cacaggggtg gagtttctga cgaaggtcct aagggagtgt 
1201 ttgtgtctga ctcaggcgcc tggcacattt cagggagaaa ctccaaagtc cacacaaaga
1261 ttttctaagg aatgcacaaa ttgaaaacac actcaaaaga caaacatgca agtaaagaaa
1321 aaaaaaaaaa aaaa (SEQ ID NO:1)



FIGURE 4A 



ET-1 (endothelin-1, NM_001955)

MDYLLMIFSLLFVACQGAPETAVLGAELSAVGENGGEKPTPSPP
WRLRRSKRCSCSSLMDKECVYFCHLDIIWVNTPEHVVPYGLGSPRSKRALENLLPTKA
TDRENRCQCASQKDKKCWNFCQAGKELRAEDIMEKDWNNHKKGKDCSKLGKKCIYQQL
VRGRKIRRSSEEHLRQTRSETMRNSVKSSFHDPKLKGNPSRERYVTHNRAHW (SEQ ID NO:2)

AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408)

1 ccgcatccta gccgccgact cacacaaggc aggtgggtga ggaaatccag agttgccatg 
61 gagaaaattc cagtgtcagc attcttgctc cttgtggccc tctcctacac tctggccaga 
121 gataccacag tcaaacctgg agccaaaaag gacacaaagg actctcgacc caaactgccc 
181 cagaccctct ccagaggttg gggtgaccaa ctcatctgga ctcagacata tgaagaagct 
241 ctatataaat ccaagacaag caacaaaccc ttgatgatta ttcatcactt ggatgagtgc 
301 ccacacagtc aagctttaaa gaaagtgttt gctgaaaata aagaaatcca gaaattggca
361 gagcagtttg tcctcctcaa tctggtttat gaaacaactg acaaacacct ttctcctgat 
421 ggccagtatg tccccaggat tatgtttgtt gacccatctc tgacagttag agccgatatc 
481 actggaagat attcaaatcg tctctatgct tacgaacctg cagatacagc tctgttgctt 
541 gacaacatga agaaagctct caagttgctg aagactgaat tgtaaagaaa aaaaatctcc 
601 aagcccttct gtctgtcagg ccttgagact tgaaaccaga agaagtgtga gaagactggc 
661 tagtgtggaa gcatagtgaa cacactgatt aggttatggt ttaatgttac aacaactatt 
721 ttttaagaaa aacaagtttt agaaatttgg tttcaagtgt acatgtgtga aaacaatatt 
781 gtatactacc atagtgagcc atgattttct aaaaaaaaaa ataaatgttt tgggggtgtt 
841 ctgttttctc caacttggtc tttcacagtg gttcgtttac caaataggat taaacacaca 
901 caaaatgctc aaggaaggga caagacaaaa ccaaaactag ttcaaatgat gaagaccaaa 
961 gaccaagtta tcatctcacc acaccacagg ttctcactag atgactgtaa gtagacacga 
1021 gcttaatcaa cagaagtatc aagccatgtg ctttagcata aaagaatatt tagaaaaaca 
1081 tcccaagaaa atcacatcac tacctagagt caactctggc caggaactct aaggtacaca 
1141 ctttcattta gtaattaaat tttagtcaga ttttgcccaa cctaatgctc tcagggaaag 

1201 cctctggcaa gtagctttct ccttcagagg tctaatttag tagaaaggtc atccaaagaa
1261 catctgcact cctgaacaca ccctgaagaa atcctgggaa ttgaccttgt aatcgatttg 
1321 tctgtcaagg tcctaaagta ctggagtgaa ataaattcag ccaacatgtg actaattgga 

1381 agaagagcaa agggtggtga cgtgttgatg aggcagatgg agatcagagg ttactagggt
1441 ttaggaaacg tgaaaggctg tggcatcagg gtaggggagc attctgccta acagaaatta 
1501 gaattgtgtg ttaatgtctt cactctatac ttaatctcac attcattaat atatggaatt 
1561 cctctactgc ccagcccctc ctgatttctt tggcccctgg actatggtgc tgtatataat 
1621 gctttgcagt atctgttgct tgtcttgatt aacttttttg gataaaacct tttttgaaca 
1681 gaaaaaaaaa aaaaaaaaaa a (SEQ ID NO: 3) 



FIGURE 5A 



AGR2 (anterior gradient 2 (Xenopus laevis) homolog, NM_006408)

MEKIPVSAFLLLVALSYTLARDTTVKPGAKKDTKDSRPKLPQTL
SRGWGDQLIWTQTYEEALYKSKTSNKPLMIIHHLDECPHSQALKKVFAENKEIQKLAE
QFVLLNLVYETTDKHLSPDGQYVPRIMFVDPSLTVRADITGRYSNRLYAYEPADTALL
LDNMKKALKLLKTEL (SEQ ID NO: 4)

ADAM8 (NM_001109)

1 gacccggcca tgcgcggcct cgggctctgg ctgctgggcg cgatgatgct gcctgcgatt 
61 gcccccagcc ggccctgggc cctcatggag cagtatgagg tcgtgttgcc gcggcgtctg 
121 ccaggccccc gagtccgccg agctctgccc tcccacttgg gcctgcaccc agagagggtg 
181 agctacgtcc ttggggccac agggcacaac ttcaccctcc acctgcggaa gaacagggac 
241 ctgctgggtt ccggctacac agagacctat acggctgcca atggctccga ggtgacggag 
301 cagcctcgcg ggcaggacca ctgcttatac cagggccacg tagaggggta cccggactca
361 gccgccagcc tcagcacctg tgccggcctc aggggtttct tccaggtggg gtcagacctg
421 cacctgatcg agcccctgga tgaaggtggc gagggcggac ggcacgccgt gtaccaggct 

481 gagcacctgc tgcagacggc cgggacctgc ggggtcagcg acgacagcct gggcagcctc
541 ctgggacccc ggacggcagc cgtcttcagg cctcggcccg gggactctct gccatcccga 
601 gagacccgct acgtggagct gtatgtggtc gtggacaatg cagagttcca gatgctgggg
661 agcgaagcag ccgtgcgtca tcgggtgctg gaggtggtga atcacgtgga caagctatat
721 cagaaactca acttccgtgt ggtcctggtg ggcctggaga tttggaatag tcaggacagg 
781 ttccacgtca gccccgaccc cagtgtcaca ctggagaacc tcctgacctg gcaggcacgg 
841 caacggacac ggcggcacct gcatgacaac gtacagctca tcacgggtgt cgacttcacc 
901 gggactactg tggggtttgc cagggtgtcc gccatgtgct cccacagctc aggggctgtg
961 aaccaggacc acagcaagaa ccccgtgggc gtggcctgca ccatggccca tgagatgggc 

1021 cacaacctgg gcatggacca tgatgagaac gtccagggct gccgctgcca ggaacgcttc 
1081 gaggccggcc gctgcatcat ggcaggcagc attggctcca gtttccccag gatgttcagt
1141 gactgcagcc aggcctacct ggagagcttt ttggagcggc cgcagtcggt gtgcctcgcc 
1201 aacgcccctg acctcagcca cctggtgggc ggccccgtgt gtgggaacct gtttgtggag
1261 cgtggggagc agtgcgactg cggccccccc gaggactgcc ggaaccgctg ctgcaactct
1321 accacctgcc agctggctga gggggcccag tgtgcgcacg gtacctgctg ccaggagtgc
1381 aaggtgaagc cggctggtga gctgtgccgt cccaagaagg acatgtgtga cctcgaggag
1441 ttctgtgacg gccggcaccc tgagtgcccg gaagacgcct tccaggagaa cggcacgccc 
1501 tgctccgggg gctactgcta caacggggcc tgtcccacac tggcccagca gtgccaggcc 
1561 ttctgggggc caggtgggca ggctgccgag gagtcctgct tctcctatga catcctacca 
1621 ggctgcaagg ccagccggta cagggctgac atgtgtggcg ttctgcagtg caagggtggg 
1681 cagcagcccG tggggcgtgc catctgcatc gtggatgtgt gccacgcgct caccacagag 
1741 gatggcactg cgtatgaacc agtgcccgag ggcacccggt gtggaccaga gaaggtttgc 
1801 tggaaaggac gttgccagga cttacacgtt tacagatcca gcaactgctc tgcccagtgc 
1861 cacaaccatg gggtgtgcaa ccacaagcag gagtgccact gccacgcggg ctgggccccg 
1921 ccccactgcg cgaagctgct gactgaggtg cacgcagcgt ccgggagcct ccccgtcctc 
1981 gtggtggtgg ttctggtgct cctggcagtt gtgctggtca ccctggcagg catcatcgtc 
2041 taccgcaaag cccggagccg catcctgagc aggaacgtgg ctcccaagac cacaatgggg
2101 cgctccaacc ccctgttcca ccaggctgcc agccgcgtgc cggccaaggg cggggctcca 
2161 gccccatcca ggggccccca agagctggtc cccaccaccc acccgggcca gcccgcccga 
2221 cacccggcct cctcggtggc tctgaagagg ccgccccctg ctcctccggt cactgtgtcc 
2281 agcccaccct tcccagttcc tgtctacacc cggcaggcac caaagcaggt catcaagcca
2341 acgttcgcac ccccagtgcc cccagtcaaa cccggggctg gtgcggccaa ccctggtcca 
2401 gctgagggtg ctgttggccc aaaggttgcc ctgaagcccc ccatccagag gaagcaagga 
2461 gccggagctc ccacagcacc ctaggggggc acctgcgcct gtgtggaaat ttggagaagt 
2521 tgcggcagag aagccatgcg ttccagcctt ccacggtcca gctagtgccg ctcagcccta
2581 gaccctgact ttgcaggctc agctgctgtt ctaacctcag taatgcatct acctgagagg 
2641 ctcctgctgt ccacgccctc agccaattcc ttctccccgc cttggccacg tgtagcccca 
2701 gctgtctgca ggcaccaggc tgggatgagc tgtgtgcttg cgggtgcgtg tgtgtgtacg
2761 tgtctccagg tggccgctgg tctcccgctg tgttcaggag gccacatata cagcccctcc 
2821 cagccacacc tgcccctgct ctggggcctg ctgagccggc tgccctgggc acccggttcc
2881 aggcagcaca gacgtggggc atccccagaa agactccatc ccaggaccag gttcccctcc
2941 gtgctcttcg agagggtgtc agtgagcaga ctgcacccca agctcccgac tccaggtccc
3001 ctgatcttgg gcctgtttcc catgggattc aagagggaca gccccagctt tgtgtgtgtt
3061 taagcttagg aatgcccttt atggaaaggg ctatgtggga gagtcagcta tcttgtctgg
3121 ttttcttgag acctcagatg tgtgttcagc agggctgaaa gcttttattc tttaataatg 
3181 agaaatgtat attttactaa taaattattg accgagttct gtagattctt gttaga (SEQ ID NO: 5)

ADAM8 (NM_001109)

MRGLGLWLLGAMMLPAIAPSRPWALMEQYEVVLPRRLPGPRVRR
ALPSHLGLHPERVSYVLGATGHNFTLHLRKNRDLLGSGYTETYTAANGSEVTEQPRGQ
DHCLYQGHVEGYPDSAASLSTCAGLRGFFQVGSDLHLIEPLDEGGEGGRHAVYQAEHL
LQTAGTCGVSDDSLGSLLGPRTAAVFRPRPGDSLPSRETRYVELYVVVDNAEFQMLGS
EAAVRHRVLEVVNHVDKLYQKLNFRVVLVGLEIWNSQDRFHVSPDPSVTLENLLTWQA
RQRTRRHLHDNVQLITGVDFTGTTVGFARVSAMCSHSSGAVNQDHSKNPVGVACTMAH
EMGHNLGMDHDENVQGCRCQERFEAGRCIMAGSIGSSFPRMFSDCSQAYLESFLERPQ
SVCLANAPDLSHLVGGPVCGNLFVERGEQCDCGPPEDCRNRCCNSTTCQLAEGAQCAH
GTCCQECKVKPAGELCRPKKDMCDLEEFCDGRHPECPEDAFQENGTPCSGGYCYNGAC
PTLAQQCQAFWGPGGQAAEESCFSYDILPGCKASRYRADMCGVLQCKGGQQPLGRAIC
IVDVCHALTTEDGTAYEPVPEGTRCGPEKVCWKGRCQDLHVYRSSNCSAQCHNHGVCN
HKQECHCHAGWAPPHCAKLLTEVHAASGSLPVLVVVVLVLLAVVLVTLAGIIVYRKAR
SRILSRNVAPKTTMGRSNPLFHQAASRVPAKGGAPAPSRGPQELVPTTHPGQPARHPA
SSVALKRPPPAPPVTVSSPPFPVPVYTRQAPKQVIKPTFAPPVPPVKPGAGAANPGPA
EGAVGPKVALKPPIQRKQGAGAPTAP (SEQ ID NO: 6)

PRSS8 (Prostasin precursor, serine protease, NM_002773)



1 gactttggtg gcaagaggag ctggcggagc ccagccagtg ggcggggcca ggggaggggc 
61 gggcaggtag gtgcagccac tcctgggagg accctgcgtg gccagacggt gctggtgact 
121 cgtccacact gctcgcttcg gatactccag gcgtctcccg ttgcggccgc tccctgcctt 
181 agaggccagc cttggacact tgctgcccct ttccagcccg gattctggga tccttccctc 
241 tgagccaaca tctgggtcct gccttcgaca ccaccccaag gcttcctacc ttgcgtgcct 
301 ggagtctgcc ccaggggccc ttgtcctggg ccatggccca gaagggggtc ctggggcctg
361 ggcagctggg ggctgtggcc attctgctct atcttggatt actccggtcg gggacaggag 
421 cggaaggggc agaagctccc tgcggtgtgg ccccccaagc acgcatcaca ggtggcagca 
481 gtgcagtcgc cggtcagtgg ccctggcagg tcagcatcac ctatgaaggc gtccatgtgt 
541 gtggtggctc tctcgtgtct gagcagtggg tgctgtcagc tgctcactgc ttccccagcg 
601 agcaccacaa ggaagcctat gaggtcaagc tgggggccca ccagctagac tcctactccg 
661 aggacgccaa ggtcagcacc ctgaaggaca tcatccccca ccccagctac ctccaggagg 
721 gctcccaggg cgacattgca ctcctccaac tcagcagacc catcaccttc tcccgctaca 
781 tccggcccat ctgcctccct gcagccaacg cctccttccc caacggcctc cactgcactg 
841 tcactggctg gggtcatgtg gccccctcag tgagcctcct gacgcccaag ccactgcagc 
901 aactcgaggt gcctctgatc agtcgtgaga cgtgtaactg cctgtacaac atcgacgcca 
961 agcctgagga gccgcacttt gtccaagagg acatggtgtg tgctggctat gtggaggggg 
1021 gcaaggacgc ctgccagggt gactctgggg gcccactctc ctgccctgtg gagggtctct 
1081 ggtacctgac gggcattgtg agctggggag atgcctgtgg ggcccgcaac aggcctggtg 
1141 tgtacactct ggcctccagc tatgcctcct ggatccaaag caaggtgaca gaactccagc 

1201 ctcgtgtggt gccccaaacc caggagtccc agcccgacag caacctctgt ggcagccacc
1261 tggccttcag ctctgcccca gcccagggct tgctgaggcc catccttttc ctgcctctgg 
1321 gcctggctct gggcctcctc tccccatggc tcagcgagca ctgagctggc cctacttcca 

1381 ggatggatgc atcacactca aggacaggag cctggtcctt ccctgatggc ctttggaccc
1441 agggcctgac ttgagccact ccttccttca ggactctgcg ggaggctggg gccccatctt 
1501 gatctttgag cccattcttc tgggtgtgct ttttgggacc atcactgaga gtcaggagtt 
1561 ttactgcctg tagcaatggc cagagcctct ggcccctcac ccaccatgga ccagcccatt 
1621 ggccgagctc ctggggagct cctgggaccc ttggctatga aaatgagccc tggctcccac 
1681 ctgtttctgg aagactgctc ccggcccgcc tgcccagact gatgagcaca tctctctgcc 
1741 ctctccctgt gttctgggct ggggccacct ttgtgcagct tcgaggacag gaaaggcccc 
1801 aatcttgccc actggccgct gagcgccccc gagccctgac tcctggactc cggaggactg 
1861 agcccccacc ggaactgggc tggcgcttgg atctggggtg ggagtaacag ggcagaaatg 
1921 attaaaatgt ttgagcac (SEQ ID NO: 7) 
PRSS8 (Prostasin precursor, serine protease, NM_002773)



MAQKGVLGPGQLGAVAILLYLGLLRSGTGAEGAEAPCGVAPQAR 

ITGGSSAVAGQWPWQVSITYEGVHVCGGSLVSEQWVLSAAHCFPSEHHKEAYEVKLGA

HQLDSYSEDAKVSTLKDIIPHPSYLQEGSQGDIALLQLSRPITFSRYIRPICLPAANA

SFPNGLHCTVTGWGHVAPSVSLLTPKPLQQLEVPLISRETCNCLYNIDAKPEEPHFVQ 

EDMVCAGYVEGGKDACQGDSGGPLSCPVEGLWYLTGIVSWGDACGARNRPGVYTLASS 

YASWIQSKVTELQPRVVPQTQESQPDSNLCGSHLAFSSAPAQGLLRPILFLPLGLALG

LLSPWLSEH (SEQ ID NO: 8) 
AXO1 (Axonin-1 precursor, NM_005076)



1 acacacacgc gccctcaccc gccaccgccg ccgcggccgc cgccgcaccc ggacagcgag 
61 cggctgaggc cgccagggcc caaaggacag cggcccagac aggggctggc ggcccggccg 
121 gccccggctc accgactcgg gcagcatcca cctgccccag ccaacaccct tctctcgccc 
181 caggtccttt ctcagcctcc agctgggctg tccccaagct gagctgaggc tcttctcctc 
241 cgatccccac ctctgcccgg acatccacca tggggacagc caccaggagg aagccacacc 
301 tgctgctggt agctgctgtg gcccttgtct cctcttcagc ttggagttca gccctgggat
361 cccaaaccac cttcgggcct gtctttgaag accagcccct cagtgtgcta ttcccagagg
421 agtccacgga ggagcaggtg ttgctggcat gccgcgcccg ggccagccct ccagccacct 
481 atcggtggaa gatgaatggt accgagatga agctggagcc aggttcccgt caccagctgg 
541 tggggggcaa cctggtcatc atgaacccca ccaaggcaca ggatgccggg gtctaccagt 
601 gcctggcctc caacccagtg ggcaccgttg tcagcaggga ggccatcctc cgcttcggct 
661 ttctgcagga attctccaag gaggagcgag acccagtgaa agctcatgaa ggctgggggg 
721 tgatgttgcc ctgtaaccca cctgcccact acccaggctt gtcctaccgc tggctcctca 
781 acgagttccc caacttcatc ccgacggacg ggcgtcactt cgtgtcccag accacaggga
841 acctgtacat tgcccgaacc aatgcctcag acctgggcaa ctactcctgt ttggccacca 
901 gccacatgga cttctccacc aagagcgtct tcagcaagtt tgctcagctc aacctggctg
961 ctgaagatac ccggctcttt gcacccagca tcaaggcccg gttcccagca gagacctatg 
1021 cactggtggg gcagcaggtc accctggagt gcttcgcctt tgggaaccct gtcccccgga 
1081 tcaagtggcg caaagtggac ggctccctgt ccccgcagtg gaccacagct gagcccaccc 
1141 tgcagatccc cagcgtcagc tttgaggatg agggcaccta cgagtgtgag gcggagaact 
1201 ccaagggccg agacaccgtg cagggccgca tcatcgtgca ggctcagcct gagtggctaa

1261 aagtgatctc ggacacagag gctgacattg gctccaacct gcgttggggc tgtgcagccg
1321 ccggcaagcc ccggcctaca gtgcgctggc tgcggaacgg ggagcctctg gcctcccaga 

1381 accgggtgga ggtgttggct ggggacctgc ggttctccaa gctgagcctg gaagactcgg
1441 gcatgtacca gtgtgtggca gagaataagc acggtaccat ctacgccagc gccgagctag 
1501 ccgtgcaagc actcgcccct gacttcaggc tgaatcccgt gaggcgtctg atccccgcgg 
1561 cccgcggggg agagatcctt atcccctgcc agccccgggc agctccaaag gccgtggtgc 
1621 tctggagcaa aggcacggag attttggtca acagcagcag agtgactgta actccagatg 
1681 gcaccttgat cataagaaac atcagccggt cagatgaagg caaatacacc tgctttgctg
1741 agaacttcat gggcaaagcc aacagcactg gaatcctatc tgtgcgagat gcaaccaaaa 
1801 tcactctagc cccctcaagt gccgacatca acttgggtga caacctgacc ctacagtgcc 
1861 atgcctccca cgaccccacc atggacctca ccttcacctg gaccctggac gacttcccca 
1921 tcgactttga taagcctgga gggcactacc ggagaactaa tgtgaaggag accattgggg 
1981 atctgaccat cctgaacgcc cagctgcgcc atggggggaa gtacacgtgc atggcccaga 
2041 cggtggtgga cagcgcgtcc aaggaggcca cagtcctggt ccgaggtccg ccaggtcccc
2101 caggaggtgt ggtggtgagg gacattggcg acaccaccat ccagctcagc tggagccgtg 
2161 gcttcgacaa ccacagcccc atcgctaagt acaccctgca agctcgcact ccacctgcag 
2221 ggaagtggaa gcaggttcgg accaatcctg caaacatcga gggcaatgcc gagactgcac 
2281 aggtgctggg cctcaccccc tggatggact atgagttccg ggtcatagcc agcaacattc
2341 tgggcactgg ggagcctagt gggccctcca gcaaaatccg gaccagggaa gcagccccct 
2401 cggtggcacc ctcaggactc agcggaggag gtggagcccc cggagagctc atcgtcaact
2461 ggacgcccat gtcacgggag taccagaacg gagacggctt cggctacctg ctgtccttcc 
2521 gcaggcaggg cagcactcac tggcagaccg cccgggtgcc tggcgccgat gcccagtact 
2581 ttgtctacag caacgagagc gtccggccct acacgccctt tgaggtcaag atccgcagct 
2641 acaaccgccg cggggatggg cccgagagcc tcactgcact cgtgtactca gctgaggaag

2701 agcccagggt ggcccctacc aaggtgtggg ccaaaggggt ctcatcctca gagatgaacg
2761 tgacctggga acccgtgcag caggacatga atggtatcct cctggggtat gagatccgct 
2821 actggaaagc tggggacaaa gaagcagctg cggaccgagt gaggacagca gggctggaca 
2881 ccagtgcccg agtcagcggc ctgcatccca acaccaagta ccatgtgacc gtgagggcct 
2941 acaaccgggc tggcactggg cctgccagcc cttctgccaa cgccacgacc atgaagcccc 
3001 ctccgcggcg acctcctggc aacatctcct ggactttctc aagctctagt cttagcatta 

3061 agtgggaccc tgtggtccct ttccgaaatg agtctgcagt caccggctat aagatgctgt
3121 accagaatga cttacacctg actcccacgc tccacctcac cggcaagaac tggatagaaa
3181 tcccagtgcc tgaagacatt ggccatgccc tggtacaaat tcggaccaca gggcccggag
3241 gggatgggat ccctgcagaa gtccacatcg tgaggaatgg aggcacaagc atgatggtgg
3301 agaacatggc agtccgccca gcaccacacc ctggcaccgt catttcccac tccgtggcga
3361 tgctgatcct cataggctcc ctggagctct gatcctggaa cccctccctc tgcgccgcag
3421 ctggacgcca cctccgacgg acacagccag ccccttcctg ctgccaaggt ggcctgacac
3481 tgtgccagag agtggctggt tttaaatacc tactttaaac agtgcccttt ttgtaggagg
3541 taggatattt tatattctgc cgcaggatag aacccacgca aggattttct ttaaattgag
3601 aggcaccagg cagtaacttc catgatgaca ctgacgccta tacctgagct ctaggctgcc
3661 tggagggaag gaacaggccc atgggaagaa gggggtttta aaaacatgtc ttcaactcag
3721 cagagatggc cctctgggac cctatacgga ctccgccact tgagagcagt cctaggcccg
3781 gcaggaacac cagacatgaa caggttgaag aactggagcg aagtgcacac ctcaccatcc
3841 ttcagtctaa ggaagaaggg caagccctgg gaccaagagc tctcccgcct tctccctcga
3901 gcagcagcaa ggaccctgac gctgtccccg ataactccct aggggctcct gcctgcccaa
3961 gcggctgaga accagcgccc cgatgcctga ggctgggagc ctgagcccct tcagctttga
4021 ggggggtgat actccaggct gtttggggtg ggagccaaaa agagttgaga ggccagggcc
4081 cttggtggaa aggggcacca gccttggtct gagatagtca caacccaggt gacgatgccc
4141 tctcagccaa cactgccaac ctgaccctgt catcccgatt gacagcgcca cttcaggtgg
4201 ctgggtgact aaagggcttg tcttggtggg gtctcccacc cctccaagac ccattctgca
4261 cagtccctcc agggtttggg caggagatgg ccaatcatgc gcccacctct ccagtgctgc
4321 ctgcagtcag ctcggcctcc ccgacctgca gccccagact ctgctctccc agcactgact
4381 cactcctgcc tgggagggga atgcagcatt catgctgtgt gtcctggtat tgggaggttt
4441 ctgggaaggg cagaggataa atgtggccct gcctgctccc aggtatacct aggaccacct
4501 ggccagatcc gctcccagac ggccttggac tgcttgcatt tccccggaga aaaaggggtt
4561 aataaatggg ccatcctttc ctgagctctg ggtatactac cagtcacaga acgtcagagc
4621 tggaagaagc cttagagctc aacttcttca agcccctcac tttacagatg aggaaatgga
4681 ggtggtccag agagggtctg ggattcccaa ggtcacacag cccagaagag atggggctgg
4741 gttaagaact cgagtcttcc acctttctgt tcaaggctgt ttgtctaccc agaggaagga
4801 ggcactgctg aatggctatg gcctggctaa gaaggtgatt agtcagtagg gtgtgaaaat
4861 tctacttcaa ggggttcgga ttggtgatca tggggattgg catggctggg ttcccgtcca
4921 aggtgtgggc agagcttcta ccaaacttca acatggaggg ctgacttgaa gctccctgtc
4981 cccctcactc ttgccccaag aaaagaggcc aaagcaagag cagattccct aggcaagagc
5041 agcagcacaa ctaggaaacc ccaaagccca tgctccgaca ggtggccctt cacagggggc
5101 agcgggacag gcatcttgaa gggcatatgt cctcggaagc tccgagcctg ttttctgtag
5161 tttatagtta gagctctatt ttgttatggt tttttaaact tttaagtcct gctctatttt
5221 cctgggcagg tttatgttga tgtttaccca ctacaatttt ttaaaaatat aagctcacat
5281 gccttttccc tgccacagcc aaacccccac tgcaccctac ccacccaccc ctagcccagg
5341 tcagctttcc tggagctggc taatgaaagc ctcctcacct cttcccaacc cttacaagca
5401 agggtgctag gggctcagct atacgaccat tctccctgac agggagtcca aacttggcct
5461 agcatccctc ctggcccccc tctggccacg acttggcctg tgcctggttc tctatcagaa
5521 aggggatgct gaacaaaacc tccttccaag ttttatccaa ttcgttcctc attgcctcgg
5581 gctgcgtcag gggaagcagg ggacaggtgt ccagttgctg ggccgaggga ggagctggtt
5641 tggcatagga cctaaccagt gaagctagag gctacagcca ctaaacttgc ttcaggccaa
5701 cgatagttac tcacaagtaa gtaccttaat gctaatgagg tccactaaaa aggggaggaa
5761 ggcagacctc ctgggagacc cacgaagggt ttttagccag ggaaaactga gccccaggaa
5821 aacctaacca ctgggcaggc agaatttgtt tgagggatag aacgacaaca aaataaatgt
5881 tcctgcagcc tgagatttca ggtagagtac tgactaaggt ttaataagac aataggtgac
5941 ctgaggacat gcaagcttgt aaaatgcaac agcctcctgc tagagtgact tgtacatgag
6001 cttgcttgca gaagactaga ttagatgttt ctcaggatcc cctcctgcgc aggggttctc
6061 tgattttcgt gttctctgcc cagatgggct gggggagttg agagtgtgct tattttcact
6121 gcgatcatga gaccacagtt ctgggttatc tcctctcata catcaagccc cagaggaggc
6181 ggcaagagga acagccacaa acaagtactt taccccacag cttagtggcc agtaaacacc
6241 ctggggacta ggaaaaggaa ccaactgtag gcacctctcc agggcctagg gagacaagtg

6301 tcctctcttc tgcatacatt tgggctcccc ttacagagcc ctttgccctg gctctctggt 

6361 ccttgttgct ctaacagtcc agatgtacac ccagcctcag ggggaaggca gctctctcca 

6421 gacagagtct cagggcccag caaggtcagg ttatctgctt tcattcaggg caacaaatga 

6481 tacaaatggt gccagggagt ggcaaggcca tgggggtagg tgggggtgtc tttttctttt 

6541 cataaagtaa caacagacga gactgaggtt aaacatcaga aaaaaacctc tggaatgacc 

6601 ttcctcattc caggaggccc tggaataagg aagaggcttc tttctgaggg agctttgagg 

6661 aattttgaca gctgttgaca tgggatttgg gaaaggtgaa gctgtgactg gaggggcagg 

6721 agatggtcca agtgtccatc cagagatgag actcttagaa tcaaagtgtt cagcccagga 

6781 agtcttggag atcccacctt ctgtggccct gcaccttatg ggaagccatt aagggggctc 

6841 atctaggaat tctggttaca gcccagtgct catcccagcg tatgctgcct ctttagggca 

6901 gccccaaggg ccagccagcc tgtactctgg gcaagagccc aaaatggcta ggaatgtttg 

6961 actcccttaa tctcttcccc agctacagag gaatcttttc tctgcctggt ctcagaatgg 

7021 gactgccaac tggctcattg gtgggagaca cagtatcctc aaacctgtgg ccactggcat 

7081 gacagtggtg ctctgtctcc ctgggtgaca cccaccctag gcttcctcct ggatgtgatg 

7141 gggattgcca gagaggctct tagcataaaa ggcattaggt gggcattttt ctgtgtgccc 

7201 ccaaaaagct ccatggaaac aggcacctgg tagctgcgga acacccgtgg acttgtgtat

7261 atggtcatag gctttgggaa gacaggacgt aaaggaaaat gagagaaaca aaatgggtca
7321 gatagctttg gccacagccc caggcagcct ttggggccta tgacacttag tgcccttaga 

7381 tgggatacat cttgcctcgg ccccaagact cctccaactt acccgtccca tccagggcct
7441 gcacagctta gagaggctca cagcttggca aatgctaggg cttcatcaga ccactgactt 
7501 gactcagtgt ttgttaaaat ggaaccactc ccgttggcct actgtttctc tcctgtactt 
7561 cttgtaatga tagttattta ttgactctgg tagcaggcag ttcttaaata aagatggttt
7621 ctcaacctgt tggggaaaaa aaaaaaaaaa (SEQ ID NO: 9) 
AXO1 (Axonin-1 precursor, NM_005076)



MGTATRRKPHLLLVAAVALVSSSAWSSALGSQTTFGPVFEDQPL 

SVLFPEESTEEQVLLACRARASPPATYRWKMNGTEMKLEPGSRHQLVGGNLVIMNPTK
AQDAGVYQCLASNPVGTVVSREAILRFGFLQEFSKEERDPVKAHEGWGVMLPCNPPAH
YPGLSYRWLLNEFPNFIPTDGRHFVSQTTGNLYIARTNASDLGNYSCLATSHMDFSTK
SVFSKFAQLNLAAEDTRLFAPSIKARFPAETYALVGQQVTLECFAFGNPVPRIKWRKV
DGSLSPQWTTAEPTLQIPSVSFEDEGTYECEAENSKGRDTVQGRIIVQAQPEWLKVIS
DTEADIGSNLRWGCAAAGKPRPTVRWLRNGEPLASQNRVEVLAGDLRFSKLSLEDSGM
YQCVAENKHGTIYASAELAVQALAPDFRLNPVRRLIPAARGGEILIPCQPRAAPKAVV
LWSKGTEILVNSSRVTVTPDGTLIIRNISRSDEGKYTCFAENFMGKANSTGILSVRDA
TKITLAPSSADINLGDNLTLQCHASHDPTMDLTFTWTLDDFPIDFDKPGGHYRRTNVK
ETIGDLTILNAQLRHGGKYTCMAQTVVDSASKEATVLVRGPPGPPGGVVVRDIGDTTI
QLSWSRGFDNHSPIAKYTLQARTPPAGKWKQVRTNPANIEGNAETAQVLGLTPWMDYE
FRVIASNILGTGEPSGPSSKIRTREAAPSVAPSGLSGGGGAPGELIVNWTPMSREYQN
GDGFGYLLSFRRQGSTHWQTARVPGADAQYFVYSNESVRPYTPFEVKIRSYNRRGDGP
ESLTALVYSAEEEPRVAPTKVWAKGVSSSEMNVTWEPVQQDMNGILLGYEIRYWKAGD
KEAAADRVRTAGLDTSARVSGLHPNTKYHVTVRAYNRAGTGPASPSANATTMKPPPRR
PPGNISWTFSSSSLSIKWDPVVPFRNESAVTGYKMLYQNDLHLTPTLHLTGKNWIEIP
VPEDIGHALVQIRTTGPGGDGIPAEVHIVRNGGTSMMVENMAVRPAPHPGTVISHSVA 
MLILIGSLEL (SEQ ID NO: 10) 
NROB2 (Nuclear hormone receptor, NM_021969)



1 gagctggaag tgagagcaga tccctaacca tgagcaccag ccaaccaggg gcctgcccat 

61 gccagggagc tgcaagccgc cccgccattc tctacgcact tctgagctcc agcctcaagg 

121 ctgtcccccg accccgtagc cgctgcctat gtaggcagca ccggcccgtc cagctatgtg 

181 cacctcatcg cacctgccgg gaggccttgg atgttctggc caagacagtg gccttcctca 

241 ggaacctgcc atccttctgg cagctgcctc cccaggacca gcggcggctg ctgcagggtt 

301 gctggggccc cctcttcctg cttgggttgg cccaagatgc tgtgaccttt gaggtggctg

361 aggccccggt gcccagcata ctcaagaaga ttctgctgga ggagcccagc agcagtggag
421 gcagtggcca actgccagac agaccccagc cctccctggc tgcggtgcag tggcttcaat 

481 gctgtctgga gtccttctgg agcctggagc ttagccccaa ggaatatgcc tgcctgaaag
541 ggaccatcct cttcaacccc gatgtgccag gcctccaagc cgcctcccac attgggcacc 
601 tgcagcagga ggctcactgg gtgctgtgtg aagtcctgga accctggtgc ccagcagccc
661 aaggccgcct gacccgtgtc ctcctcacgg cctccaccct caagtccatt ccgaccagcc 
721 tgcttgggga cctcttcttt cgccctatca ttggagatgt tgacatcgct ggccttcttg 
781 gggacatgct tttgctcagg tgacctgttc cagcccaggc agagatcagg tgggcagagg 
841 ctggcagtgc tgattcagcc tggccatccc cagaggtgac ccaatgctcc tggaggggca 
901 agcctgtata gacagcactt ggctccttag gaacagctct tcactcagcc acaccccaca 
961 ttggacttcc ttggtttgga cacagtgctc cagctgcctg ggaggctttt ggtggtcccc 

1021 acagcctctg ggccaagact cctgtccctt cttgggatga gaatgaaagc ttaggctgct 

1081 tattggacca gaagtcctat cgactttata cagaactgaa ttaagttatt gatttttgta

1141 ataaaaggta tgaaacacta aaaaaaaa (SEQ ID NO: 11) 



FIGURE 9A 



NROB2 (Nuclear hormone receptor, NM_021969)



MSTSQPGACPCQGAASRPAILYALLSSSLKAVPRPRSRCLCRQH 

RPVQLCAPHRTCREALDVLAKTVAFLRNLPSFWQLPPQDQRRLLQGCWGPLFLLGLAQ 
DAVTFEVAEAPVPSILKKILLEEPSSSGGSGQLPDRPQPSLAAVQWLQCCLESFWSLE
LSPKEYACLKGTILFNPDVPGLQAASHIGHLQQEAHWVLCEVLEPWCPAAQGRLTRVL 
LTASTLKSIPTSLLGDLFFRPIIGDVDIAGLLGDMLLLR (SEQ ID NO: 12) 
TM7SF1 (NM_003272)

1 cggcgcgatg cgcggagacc cccgcggggg cggcggcggc cgtgagcccc gatgaggccc
61 gagcgtcccc ggccgcgcgg cagcgccccc ggcccgatgg agaccccgcc gtgggaccca
121 gcccgcaacg actcgctgcc gcccacgctg accccggccg tgccccccta cgtgaagctt
181 ggcctcaccg tcgtctacac cgtgttctac gcgctgctct tcgtgttcat ctacgtgcag
241 ctctggctgg tgctgcgtta ccgccacaag cggctcagct accagagcgt cttcctcttt
301 ctctgcctct tctgggcctc cctgcggacc gtcctcttct ccttctactt caaagacttc
361 gtggcggcca attcgctcag ccccttcgtc ttctggctgc tctactgctt ccctgtgtgc
421 ctgcagtttt tcaccctcac gctgatgaac ttgtacttca cgcaggtgat tttcaaagcc
481 aagtcaaaat attctccaga attactcaaa taccggttgc ccctctacct ggcctccctc
541 ttcatcagcc ttgttttcct gttggtgaat ttaacctgtg ctgtgctggt aaagacggga
601 aattgggaga ggaaggttat cgtctctgtg cgagtggcca ttaatgacac gctcttcgtg
661 ctgtgtgccg tctctctctc catctgtctc tacaaaatct ctaagatgtc cttagccaac
721 atttacttgg agtccaaggg ctcctccgtg tgtcaagtga ctgccatcgg tgtcaccgtg
781 atactgcttt acacctctcg ggcctgctac aacctgttca tcctgtcatt ttctcagaac
841 aagagcgtcc attcctttga ttatgactgg tacaatgtat cagaccaggc agatttgaag
901 aatcagctgg gagatgctgg atacgtatta tttggagtgg tgttatttgt ttgggaactc
961 ttacctacca ccttagtcgt ttatttcttc cgagttagaa atcctacaaa ggaccttacc
1021 aaccctggaa tggtccccag ccatggattc agtcccagat cttatttctt tgacaaccct
1081 cgaagatatg acagtgatga tgaccttgcc tggaacattg cccctcaggg acttcaggga
1141 ggttttgctc cagattacta tgattgggga caacaaacta acagcttcct ggcacaagca
1201 ggaactttgc aagactcaac tttggatcct gacaaaccaa gccttgggta gcatcagtta
1261 acagttttat ggacgattcc tcagatgaaa agcttcagaa aagcatagtg acagctgaat
1321 ttttagggca cttttcctta agaaatagaa cttgattttt atttgttaca ggtttccaat
1381 ggccccatag gaataagcaa taatgtagac tgataaaccc ttattttagt actaaagagg
1441 gagccttgct atttcagtgg gtataattta aactttttaa agaaaatctg tacttttata
1501 aagatgtatt ttgtataact taaataataa tgctaaagta tactagggtt tttttttctt
1561 gagaatgtta ctgcaatcat gttgtagttt gcacagactt ttatgcataa ttcactttaa
1621 aaatatagaa tatatggtct aatagttttt taaagctttt ggactaaagt attccacaaa
1681 tcttacctct ttaggtcact gatggtcact ccgattctga gtgccacatt ggtagactcc
1741 taaaatacag ttgacaactt agccaattgc aactccagtg ttgataatta aaatgaaatg
1801 gtaaagcagc agactgtaag gtctttagag attttttttt aaggttcagg ccgtaggttc
1861 ctcaaggaat ctcttaagtt ttgcccaaag actggtactt cctttcagta gggcgctaat
1921 gtatacacat taatgataag ttgataacat taaaaatgta gctgacttat cctattaaac
1981 ctcctctgct atgttcac (SEQ ID NO: 13)



FIGURE 10A 



TM7SF1 (NM_003272) 
MRPERPRPRGSAPGPMETPPWDPARNDSLPPTLTPAVPPYVKLG 

LTVVYTVFYALLFVFIYVQLWLVLRYRHKRLSYQSVFLFLCLFWASLRTVLFSFYFKD
FVAANSLSPFVFWLLYCFPVCLQFFTLTLMNLYFTQVIFKAKSKYSPELLKYRLPLYL 
ASLFISLVFLLVNLTCAVLVKTGNWERKVIVSVRVAINDTLFVLCAVSLSICLYKISK
MSLANIYLESKGSSVCQVTAIGVTVILLYTSRACYNLFILSFSQNKSVHSFDYDWYNV 
SDQADLKNQLGDAGYVLFGVVLFVWELLPTTLVVYFFRVRNPTKDLTNPGMVPSHGFS
PRSYFFDNPRRYDSDDDLAWNIAPQGLQGGFAPDYYDWGQQTNSFLAQAGTLQDSTLD 
PDKPSLG (SEQ ID NO: 14) 
DLDH (dihydrolipoamide dehydrogenase, NM_000108)

1 gcgcagggag gggagacctt ggcggacggc ggagccccag cggaggtgaa agtattggcg 
61 gaaaggaaaa tacagcggaa aaatgcagag ctggagtcgt gtgtactgct ccttggccaa 
121 gagaggccat ttcaatcgaa tatctcatgg cctacaggga ctttctgcag tgcctctgag 
181 aacttacgca gatcagccga ttgatgctga tgtaacagtt ataggttctg gtcctggagg 
241 atatgttgct gctattaaag ctgcccagtt aggcttcaag acagtctgca ttgagaaaaa 
301 tgaaacactt ggtggaacat gcttgaatgt tggttgtatt ccttctaagg ctttattgaa
361 caactctcat tattaccata tggcccatgg aacagatttt gcatctagag gaattgaaat
421 gtccgaagtt cgcttgaatt tagacaagat gatggagcag aagagtactg cagtaaaagc 
481 tttaacaggt ggaattgccc acttattcaa acagaataag gttgttcatg tcaatggata 
541 tggaaagata actggcaaaa atcaagtcac tgctacgaaa gctgatggcg gcactcaggt 
601 tattgataca aagaacattc ttatagccac gggttcagaa gttactcctt ttcctggaat 
661 cacgatagat gaagatacaa tagtgtcatc tacaggtgct ttatctttaa aaaaagttcc 
721 agaaaagatg gttgttattg gtgcaggagt aataggtgta gaattgggtt cagtttggca 
781 aagacttggt gcagatgtga cagcagttga atttttaggt catgtaggtg gagttggaat 
841 tgatatggag atatctaaaa actttcaacg catccttcaa aaacaggggt ttaaatttaa 
901 attgaataca aaggttactg gtgctaccaa gaagtcagat ggaaaaattg atgtttctat 
961 tgaagctgct tctggtggta aagctgaagt tatcacttgt gatgtactct tggtttgcat 
1021 tggccgacga ccctttacta agaatttggg actagaagag ctgggaattg aactagatcc 
1081 tagaggtaga attccagtca ataccagatt tcaaactaaa attccaaata tctatgccat 
1141 tggtgatgta gttgctggtc caatgctggc tcacaaagca gaggatgaag gcattatctg 
1201 tgttgaagga atggctggtg gtgctgtgca cattgactac aattgtgtgc catcagtgat

1261 ttacacacac cctgaagttg cttgggttgg caaatcagaa gagcagttga aagaagaggg
1321 tattgagtac aaagttggga aattcccatt tgctgctaac agcagagcta agacaaatgc 

1381 tgacacagat ggcatggtga agatccttgg gcagaaatcg acagacagag tactgggagc
1441 acatattctt ggaccaggtg ctggagaaat ggtaaatgaa gctgctcttg ctttggaata 
1501 tggagcatcc tgtgaagata tagctagagt ctgtcatgca catccgacct tatcagaagc 
1561 ttttagagaa gcaaatcttg ctgcgtcatt tggcaaatca atcaactttt gaattagaag 
1621 attatatatt tttttttctg aaatttcctg ggagcttttg tagaagtcac attcctgaac 
1681 aggatattct cacagctcca agaatttcta ggactgaatt atgaaacttt tggaaggtat 
1741 ttaataggtt tggacaaaat ggaatactct tatatctata ttttacataa atttagtatt 
1801 ttgtttcagt gcactaatat gtaagacaaa aaggactact tattgtagtc atcctggaat 
1861 atctccgtca actcatattt tcatgctgtt catgaaagat tcaatgcccc tgaatttaaa 
1921 tagctctttt ctctgataca gaaaagttga attttacatg gctggagcta gaatttgata 
1981 tgtgaacagt tgtgtttgaa gcacagtgat caagttattt ttaatttggt tttcacattg
2041 gaaacaagtc agtcattcag atatgattca aatgtctata aaccaaactg atgtaagtaa
2101 atggtctctc acttgtttta tttaacctct aaattctttc attttagggg tagcatttgt 
2161 gttgaagagg ttttaaagct tccattgttg tctgcaactc tgaagggtaa ttatatagtt 
2221 acccaaatta agagagtcta tttacggaac tcaaatacgt gggcattcaa atgtattaca 
2281 gtggggaatg aagatactga aataaacgtc ttaaatattc (SEQ ID NO: 15) 

FIGURE 11A 

DLDH (dihydrolipoamide dehydrogenase, NM_000108)

MQSWSRVYCSLAKRGHFNRISHGLQGLSAVPLRTYADQPIDADV 

TVIGSGPGGYVAAIKAAQLGFKTVCIEKNETLGGTCLNVGCIPSKALLNNSHYYHMAH
GTDFASRGIEMSEVRLNLDKMMEQKSTAVKALTGGIAHLFKQNKVVHVNGYGKITGKN
QVTATKADGGTQVIDTKNILIATGSEVTPFPGITIDEDTIVSSTGALSLKKVPEKMVV
IGAGVIGVELGSVWQRLGADVTAVEFLGHVGGVGIDMEISKNFQRILQKQGFKFKLNT
KVTGATKKSDGKIDVSIEAASGGKAEVITCDVLLVCIGRRPFTKNLGLEELGIELDPR
GRIPVNTRFQTKIPNIYAIGDVVAGPMLAHKAEDEGIICVEGMAGGAVHIDYNCVPSV
IYTHPEVAWVGKSEEQLKEEGIEYKVGKFPFAANSRAKTNADTDGMVKILGQKSTDRV
LGAHILGPGAGEMVNEAALALEYGASCEDIARVCHAHPTLSEAFREANLAASFGKSIN 
F (SEQ ID NO: 16) 

FIGURE 11B 
MAT2B (methionine adenosyltransferase II, beta, NM_013283)

1 gttctgggcc taggggaggc gggccgaggg cgtctgagct gaggcccgcg tcgatcctgg 
61 gttggaggag gtggcggccg ctgaggctgc ggcgtgaaga cggcgggcat ggtggggcgg
121 gagaaagagc tctctataca ctttgttccc gggagctgtc ggctggtgga ggaggaagtt 
181 aacatcccta ataggagggt tctggttact ggtgccactg ggcttcttgg cagagctgta 
241 cacaaagaat ttcagcagaa taattggcat gcagttggct gtggtttcag aagagcaaga 
301 ccaaaatttg aacaggttaa tctgttggat tctaatgcag ttcatcacat cattcatgat
361 tttcagcccc atgttatagt acattgtgca gcagagagaa gaccagatgt tgtagaaaat 
421 cagccagatg ctgcctctca acttaatgtg gatgcttctg ggaatttagc aaaggaagca 
481 gctgctgttg gagcatttct catctacatt agctcagatt atgtatttga tggaacaaat 
541 ccaccttaca gagaggaaga cataccagct cccctaaatt tgtatggcaa aacaaaatta 
601 gatggagaaa aggctgtcct ggagaacaat ctaggagctg ctgttttgag gattcctatt 
661 ctgtatgggg aagttgaaaa gctcgaagaa agtgctgtga ctgttatgtt tgataaagtg 
721 cagttcagca acaagtcagc aaacatggat cactggcagc agaggttccc cacacatgtc 
781 aaagatgtgg ccactgtgtg ccggcagcta gcagagaaga gaatgctgga tccatcaatt 
841 aagggaacct ttcactggtc tggcaatgaa cagatgacta agtatgaaat ggcatgtgca 
901 attgcagatg ccttcaacct ccccagcagt cacttaagac ctattactga cagccctgtc 
961 ctaggagcac aacgtccgag aaatgctcag cttgactgct ccaaattgga gaccttgggc 
1021 attggccaac gaacaccatt tcgaattgga atcaaagaat cactttggcc tttcctcatt 
1081 gacaagagat ggagacaaac ggtctttcat tagtttattt gtgttgggtt cttttttttt 
1141 tttaaatgaa aagtatagta tgtggcactt tttaaagaac aaaggaaata gttttgtatg 

1201 agtactttaa ttgtgactct taggatcttt caggtaaatg atgctcttgc actagtgaaa
1261 ttgtctaaag aaactaaagg gcagtcatgc cctgtttgca gtaatttttc tttttatcat 
1321 tttgtttgtc ctggctaaac ttggagtttg agtatagtaa attatgatcc ttaaatattt 

13 81 gagagtcagg atgaagcaga tctgctgtag actthtcaga tgaaattgtt cattctcgta 
1441 acctccatat tttcaggatt tttgaagctg ttgacctttt catgttgatt attttaaatt 
1501 gtgtgaaata gtataaaaat cattggtgtt cattatttgc tttgcctgag ctcagatcaa 
1561 aatgtttgaa gaaaggaact ttatttttgc aagttacgta cagtttttat gcttgagata 
1621 tttcaacatg ttatgtatat tggaacttct acagcttgat gcctcctgct tttatagcag 
1681 tttatgggga gcacttgaaa gagcgtgtgt acatgtattt tttttctagg caaacattga 
1741 atgcaaacgt gtattttttt aatataaata tataactgtc cttttcatcc catgttgccg 
1801 ctaagtgata tttcatatgt gtggttatac tcataataat gggccttgta agtcttttca
1861 ccattcatga ataataataa atatgtactg ctggcatgta atgcttagtt ttcttgtatt 
1921 tacttctttt tttaaatgta aggaccaaac ttctaaacta attgttcttt tgttgcttta 
1981 atttttaaaa attacattct tctgatgtaa catgtgatac atacaaaaga atatagttta 
2041 atatgtattg aaataaaaca caataaaatt aaaaaaaaaa aaaaaaaaaa (SEQ ID NO: 17)

FIGURE 12A 

MAT2B (methionine adenosyltransferase II, beta, NM_013283)

MVGREKELSIHFVPGSCRLVEEEVNIPNRRVLVTGATGLLGRAV

HKEFQQNNWHAVGCGFRRARPKFEQVNLLDSNAVHHIIHDFQPHVIVHCAAERRPDVV
ENQPDAASQLNVDASGNLAKEAAAVGAFLIYISSDYVFDGTNPPYREEDIPAPLNLYG
KTKLDGEKAVLENNLGAAVLRIPILYGEVEKLEESAVTVMFDKVQFSNKSANMDHWQQ

RFPTHVKDVATVCRQLAEKRMLDPSIKGTFHWSGNEQMTKYEMACAIADAFNLPSSHL 
RPITDSPVLGAQRPRNAQLDCSKLETLGIGQRTPFRIGIKESLWPFLIDKRWRQTVFH (SEQ ID NO: 18)



FIGURE 12B 
STC-2 (stanniocalcin-2, NM_003714)

1 gaggaggagg gaaaaggcga gcaaaaagga agagtgggag gaggagggga agcggcgaag 

61 gaggaagagg aggaggagga agaggggagc acaaaggatc caggtctccc gacgggaggt 

121 taataccaag aaccatgtgt gccgagcggc tgggccagtt catgaccctg gctttggtgt 

181 tggccacctt tgacccggcg cgggggaccg acgccaccaa cccacccgag ggtccccaag 

241 acaggagctc ccagcagaaa ggccgcctgt ccctgcagaa tacagcggag atccagcact 

301 gtttggtcaa cgctggcgat gtggggtgtg gcgtgtttga atgtttcgag aacaactctt

361 gtgagattcg gggcttacat gggatttgca tgacttttct gcacaacgct ggaaaatttg
421 atgcccaggg caagtcattc atcaaagacg ccttgaaatg taaggcccac gctctgcggc 

481 acaggttcgg ctgcataagc cggaagtgcc cggccatcag ggaaatggtg tcccagttgc
541 agcgggaatg ctacctcaag cacgacctgt gcgcggctgc ccaggagaac acccgggtga 
601 tagtggagat gatccatttc aaggacttgc tgctgcacga accctacgtg gacctcgtga
661 acttgctgct gacctgtggg gaggaggtga aggaggccat cacccacagc gtgcaggttc 
721 agtgtgagca gaactgggga agcctgtgct ccatcttgag cttctgcacc tcggccatcc 
781 agaagcctcc cacggcgccc cccgagcgcc agccccaggt ggacagaacc aagctctcca 
841 gggcccacca cggggaagca ggacatcacc tcccagagcc cagcagtagg gagactggcc 
901 gaggtgccaa gggtgagcga ggtagcaaga gccacccaaa cgcccatgcc cgaggcagag 
961 tcgggggcct tggggctcag ggaccttccg gaagcagcga gtgggaagac gaacagtctg 

1021 agtattctga tatccggagg tgaaatgaaa ggcctggcca cgaaatcttt cctccacgcc 
1081 gtccattttc ttatctatgg acattccaaa acatttacca ttagagaggg gggatgtcac
1141 acgcaggatt ctgtggggac tgtggacttc atcgaggtgt gtgttcgcgg aacggacagg 

1201 tgagatggag acccctgggg ccgtggggtc tcaggggtgc ctggtgaatt ctgcacttac
1261 acgtactcaa gggagcgcgc ccgcgttatc ctcgtacctt tgtcttcttt ccatctgtgg 
1321 agtcagtggg tgtcggccgc tctgttgtgg gggaggtgaa ccagggaggg gcagggcaag 

1381 gcagggcccc cagagctggg ccacacagtg ggtgctgggc ctcgccccga agcttctggt
1441 gcagcagcct ctggtgctgt ctccgcggaa gtcagggcgg ctggattcca ggacaggagt 
1501 gaatgtaaaa ataaatatcg cttagaatgc aggagaaggg tggagaggag gcaggggccg
1561 agggggtgct tggtgccaaa ctgaaattca gtttcttgtg tggggccttg cggttcagag
1621 ctcttggcga gggtggaggg aggagtgtca tttctatgtg taatttctga gccattgtac 
1681 tgtctgggct gggggggaca ctgtccaagg gagtggcccc tatgagttta tattttaacc 
1741 actgcttcaa atctcgattt cacttttttt atttatccag ttatatctac atatctgtca 
1801 tctaaataaa tggctttcaa acaaagcaac tgggtcatta aaaccagctc aaagggggtt
1861 taaaaaaaaa aaaaccagcc catcctttga ggctgatttt tctttttttt aagttctatt 
1921 ttaaaagcta tcaaacagcg acatagccat acatctgact gcctgacatg gactcctgcc 
1981 cacttggggg aaaccttata cccagaggaa aatacacacc tggggagtac atttgacaaa 
2041 tttcccttag gatttcgtta tctcaccttg accctcagcc aagattggta aagctgcgtc 
2101 ctggcgattc caggagaccc agctggaaac ctggcttctc catgtgaggg gatgggaaag 
2161 gaaagaagag aatgaagact acttagtaat tcccatcagg aaatgctgac cttttacata 
2221 aaatcaagga gactgctgaa aatctctaag ggacaggatt ttccagatcc taattggaaa 
2281 tttagcaata aggagaggag tccaagggga caaataaagg cagagagaga gagagagaga 
2341 gggagaggaa gaaaagagag agagaaaaga gcctcgtgcc (SEQ ID NO: 19) 



FIGURE 13A 



STC-2 (stanniocalcin-2, NM_003714)

MCAERLGQFMTLALVLATFDPARGTDATNPPEGPQDRSSQQKGR 

LSLQNTAEIQHCLVNAGDVGCGVFECFENNSCEIRGLHGICMTFLHNAGKFDAQGKSF
IKDALKCKAHALRHRFGCISRKCPAIREMVSQLQRECYLKHDLCAAAQENTRVIVEMI 
HFKDLLLHEPYVDLVNLLLTCGEEVKEAITHSVQVQCEQNWGSLCSILSFCTSAIQKP
PTAPPERQPQVDRTKLSRAHHGEAGHHLPEPSSRETGRGAKGERGSKSHPNAHARGRV 
GGLGAQGPSGSSEWEDEQSEYSDIRR (SEQ ID NO: 20)
PPBI (alkaline phosphatase, intestinal precursor, NM_001631)



1 gttcctggtg tccccacttc gcctccctcc tgctgccccc aagacatgca ggggccctgg
61 gtgctgctgc tgctgggcct gaggctacag ctctccctgg gcgtcatccc agctgaggag
121 gagaacccgg ccttctggaa ccgccaggca gctgaggccc tggatgctgc caagaagctg
181 cagcccatcc agaaggtcgc caagaacctc atcctcttcc tgggcgatgg gttgggggtg
241 cccacggtga cagccaccag gatcctaaag gggcagaaga atggcaaact ggggcctgag
301 acgcccctgg ccatggaccg cttcccatac ctggctctgt ccaagacata caatgtggac
361 agacaggtgc cagacagcgc agccacagcc acggcctacc tgtgcggggt caaggccaac
421 ttccagacca tcggcttgag tgcagccgcc cgctttaacc agtgcaacac gacacgcggc
481 aatgaggtca tctccgtgat gaaccgggcc aagcaagcag gaaagtcagt aggagtggtg
541 accaccacac gggtgcagca cgcctcgcca gccggcacct acgcacacac agtgaaccgc
601 aactggtact cagatgctga catgcctgcc tcagcccgcc aggaggggtg ccaggacatc
661 gccactcagc tcatctccaa catggacatt gacgtgatcc ttggcggagg ccgcaagtac
721 atgtttccca tggggacccc agaccctgag tacccagctg atgccagcca gaatggaatc
781 aggctggacg ggaagaacct ggtgcaggaa tggctggcaa agcaccaggg tgcctggtat
841 gtgtggaacc gcactgagct catgcaggcg tccctggacc agtctgtgac ccatctcatg
901 ggcctctttg agcccggaga cacgaaatat gagatcctcc gagaccccac actggacccc
961 tccctgatgg agatgacaga ggctgccctg cgcctgctga gcaggaaccc ccgcggcttc
1021 tacctctttg tggagggcgg ccgcatcgac catggtcatc atgagggtgt ggcttaccag
1081 gcagtcactg aggcggtcat gttcgacgac gccattgaga gggcgggcca gctcaccagc
1141 gaggaggaca cgctgaccct cgtcaccgct gaccactccc atgtcttctc ctttggtggc
1201 tacaccttgc gagggagctc catcttcggg ttggccccca gcaaggctca ggacagcaaa
1261 gcctacacgt ccatcctgta cggcaatggc ccgggctacg tgttcaactc aggcgtgcga
1321 ccagacgtga atgagagcga gagcgggagc cccgattacc agcagcaggc ggcggtgccc
1381 ctgtcgtccg agacccacgg aggcgaagac gtggcggtgt ttgcgcgcgg cccgcaggcg
1441 cacctggtgc atggtgtgca ggagcagagc ttcgtagcgc atgtcatggc cttcgctgcc
1501 tgtctggagc cctacacggc ctgcgacctg gcgctccccg cctgcaccac cgacgccgcg
1561 cacccagttg ccgcgtcgct gccactgctg gccgggaccc tgctgctgct gggggcgtcc
1621 gctgctccct gagtgcccca ctccggagtt atcctgctcc ccacctccgg gcgtcctgcc
1681 ctgttccccg tcctgagccg ccacttccag cgaacacaca caggtgtcct gccgttggac
1741 cttcacctcc tagagataaa ccagcctcag ctggcgcagc ggggcccttc ttccctccgc
1801 atccccttca gggagcagga gcccagggcg ccctgggagc tgagcctggg acttccagga
1861 cctcccctca ggttgttctc tgattcttcc tcccaacccc agagactgca gatttgtgcc
1921 atgcggctgc ctgcacccca gacaataaag ggaccaaaac cacccaaccc ccaccctgcc
1981 tctatcctaa ggaagaccaa gcaggcctgg acccagagac gtcccccatc gtgggacacg
2041 acacacccag accgcgtgcc ccaccgtctt agcttcaatc ctggcagcac ctggtagacc
2101 caaggacttg ggtggatcag gacacctgaa gaagagaagc ttccggcaac cctgcaaccc
2161 acccaaggag gctactggat cggggattcc caggggggct ttgacacagt cctctgctgt
2221 ctccccacta ggatcattcc acacccctgc acctgaccaa gggaccaatg aggcagaggc
2281 ttgccccaag tcacagccac tcagatgctt cctgcccccc agtgcccatt ccaggtcacc
2341 agatccaagg agcgcttgag gagctctggg tacagggcag caacccagag cccatgggcc
2401 ctcccgggac atctggatgc tgggcataga tttctcaaca aggaagactc ccctgcctcc
2461 tcaaggtctc cattctccta ggagacaaag caataataaa aggtgttaga caatgt (SEQ ID NO: 21)



FIGURE 14A 



PPBI (alkaline phosphatase, intestinal precursor, NM_001631) 
MQGPWVLLLLGLRLQLSLGVIPAEEENPAFWNRQAAEALDAAKK 

LQPIQKVAKNLILFLGDGLGVPTVTATRILKGQKNGKLGPETPLAMDRFPYLALSKTY
NVDRQVPDSAATATAYLCGVKANFQTIGLSAAARFNQCNTTRGNEVISVMNRAKQAGK
SVGVVTTTRVQHASPAGTYAHTVNRNWYSDADMPASARQEGCQDIATQLISNMDIDVI
LGGGRKYMFPMGTPDPEYPADASQNGIRLDGKNLVQEWLAKHQGAWYVWNRTELMQAS
LDQSVTHLMGLFEPGDTKYEILRDPTLDPSLMEMTEAALRLLSRNPRGFYLFVEGGRI 
DHGHHEGVAYQAVTEAVMFDDAIERAGQLTSEEDTLTLVTADHSHVFSFGGYTLRGSS 
IFGLAPSKAQDSKAYTSILYGNGPGYVFNSGVRPDVNESESGSPDYQQQAAVPLSSET 
HGGEDVAVFARGPQAHLVHGVQEQSFVAHVMAFAACLEPYTACDLALPACTTDAAHPV 
AASLPLLAGTLLLLGASAAP (SEQ ID NO: 22) 
SLNAC1 (sodium channel receptor SLNAC1, NM_004769)

1 agaattcggc acgacggggt tctggccatg aagcccacct caggcccaga ggaggcccgg 
61 cggccagcct cggacatccg cgtgttcgcc agcaactgct cgatgcacgg gctgggccac 
121 gtcttcgggc caggcagcct gagcctgcgc cgggggatgt gggcagcggc cgtggtcctg 
181 tcagtggcca ccttcctcta ccaggtggct gagagggtgc gctactacag ggagttccac 
241 caccagactg ccctggatga gcgagaaagc caccggctca tcttcccggc tgtcaccctg 
301 tgcaacatca acccactgcg ccgctcgcgc ctaacgccca acgacctgca ctgggctggg 
361 tctgcgctgc tgggcctgga tcccgcagag cacgccgcct tcctgcgcgc cctgggccgg 
421 ccccctgcac cgcccggctt catgcccagt cccacctttg acatggcgca actctatgcc 
481 cgtgctgggc actccctgga tgacatgctg ctggactgtc gcttccgtgg ccaaccttgt 
541 gggcctgaga acttcaccac gatcttcacc cggatgggaa agtgctacac atttaactct 
601 ggcgctgatg gggcagagct gctcaccact actaggggtg gcatgggcaa tgggctggac 
661 atcatgctgg acgtgcagca ggaggaatat ctacctgtgt ggagggacaa tgaggagacc 
721 ccgtttgagg tggggatccg agtgcagatc cacagccagg aggagccgcc catcatcgat 
781 cagctgggct tgggggtgtc cccgggctac cagacctttg tttcttgcca gcagcagcag 
841 ctgagcttcc tgccaccgcc ctggggcgat tgcagttcag catctctgaa ccccaactat 
901 gagccagagc cctctgatcc cctaggctcc cccagcccca gccccagccc tccctatacc 
961 cttatggggt gtcgcctggc ctgcgaaacc cgctacgtgg ctcggaagtg cggctgccga 
1021 atggtgtaca tgccaggcga cgtgccagtg tgcagccccc agcagtacaa gaactgtgcc 
1081 cacccggcca tagatgccat gcttcgcaag gactcgtgcg cctgccccaa cccgtgcgcc 
1141 agcacgcgct acgccaagga gctctccatg gtgcggatcc cgagccgcgc cgccgcgcgc 
1201 ttcctggccc ggaagctcaa ccgcagcgag gcctacatcg cggagaacgt gctggccctg 
1261 gacatcttct ttgaggccct caactatgag accgtggagc agaagaaggc ctatgagatg 
1321 tcagagctgc ttggtgacat tgggggccag atggggctgt tcatcggggc cagcctgctc 
1381 accatcctcg agatcctaga ctacctctgt gaggtgttcc gagacaaggt cctgggatat 
1441 ttctggaacc gacagcactc ccaaaggcac tccagcacca atctgcttca ggaagggctg 
1501 ggcagccatc gaacccaagt tccccacctc agcctgggcc ccagacctcc cacccctccc 
1561 tgtgccgtca ccaagactct ctccgcctcc caccgcacct gctaccttgt cacacagctc 
1621 tagacctgct gtctgtgtcc tcggagcccc gccctgacat cctggacatg cctagcctgc 
1681 acgtagcttt tccgtcttca ccccaaataa agtcctaatg catcaaaaaa aaaaaaaaaa 
1741 aaaaaa (SEQ ID NO: 23) 



FIGURE 15A 



SLNAC1 (sodium channel receptor SLNAC1, NM_004769) 

MKPTSGPEEARRPASDIRVFASNCSMHGLGHVFGPGSLSLRRGM 

WAAAVVLSVATFLYQVAERVRYYREFHHQTALDERESHRLIFPAVTLCNINPLRRSRL 
TPNDLHWAGSALLGLDPAEHAAFLRALGRPPAPPGFMPSPTFDMAQLYARAGHSLDDM 
LLDCRFRGQPCGPENFTTIFTRMGKCYTFNSGADGAELLTTTRGGMGNGLDIMLDVQQ 
EEYLPVWRDNEETPFEVGIRVQIHSQEEPPIIDQLGLGVSPGYQTFVSCQQQQLSFLP 
PPWGDCSSASLNPNYEPEPSDPLGSPSPSPSPPYTLMGCRLACETRYVARKCGCRMVY 
MPGDVPVCSPQQYKNCAHPAIDAMLRKDSCACPNPCASTRYAKELSMVRIPSRAAARF 
LARKLNRSEAYIAENVLALDIFFEALNYETVEQKKAYEMSELLGDIGGQMGLFIGASL 
LTILEILDYLCEVFRDKVLGYFWNRQHSQRHSSTNLLQEGLGSHRTQVPHLSLGPRPP 
PPCAVTKTLASHRTCYLVTQL (SEQ ID NO: 24) 
CAH4 (carbonic anhydrase iv precursor, NM_000717) 



1 ctcggtgcgc gaccccggct cagaggactc tttgctgtcc cgcaagatgc ggatgctgct 
61 ggcgctcctg gccctctccg cggcgcggcc atcggccagt gcagagtcac actggtgcta 
121 cgaggttcaa gccgagtcct ccaactaccc ctgcttggtg ccagtcaagt ggggtggaaa 
181 ctgccagaag gaccgccagt cccccatcaa catcgtcacc accaaggcaa aggtggacaa 
241 aaaactggga cgcttcttct tctctggcta cgataagaag caaacgtgga ctgtccaaaa 
301 taacgggcac tcagtgatga tgttgctgga gaacaaggcc agcatttctg gaggaggact 
361 gcctgcccca taccaggcca aacagttgca cctgcactgg tccgacttgc catataaggg 
421 ctcggagcac agcctcgatg gggagcactt tgccatggag atgcacatag tacatgagaa 
481 agagaagggg acatcgagga atgtgaaaga ggcccaggac cctgaagacg aaattgcggt 
541 gctggccttt ctggtggagg ctggaaccca ggtgaacgag ggcttccagc cactggtgga 
601 ggcactgtct aatatcccca aacctgagat gagcactacg atggcagaga gcagcctgtt 
661 ggacctgctc cccaaggagg agaaactgag gcactacttc cgctacctgg gctcactcac 
721 cacaccgacc tgcgatgaga aggtcgtctg gactgtgttc cgggagccca ttcagcttca 
781 cagagaacag atcctggcat tctctcagaa gctgtactac gacaaggaac agacagtgag 
841 catgaaggac aatgtcaggc ccctgcagca gctggggcag cgcacggtga taaagtccgg 
901 ggccccgggt cggccgctgc cctgggccct gcctgccctg ctgggcccca tgctggcctg 
961 cctgctggcc ggcttcctgc gatgatggct cacttctgca cgcagcctct ctgttgcctc 

1021 agctctccaa gttccaggct tccggtcctt agccttccca ggtgggactt taggcatgat 

1081 taaaatatgg acatattttt ggag (SEQ ID NO: 25) 



FIGURE 16A 



CAH4 (carbonic anhydrase iv precursor, NM_000717) 



RMLLALLALSAARPSASAESHWCYEVQAESSNYPCLVPVKWGG 

CQKDRQSPINIVTTKAKVDKKLGRFFFSGYDKKQTWTVQNNGHSVMMLLENKASISG 
GLPAPYQAKQLHLHWSDLPYKGSEHSLDGEHFAMEMHIVHEKEKGTSRNVKEAQDPE 
EIAVLAFLVEAGTQVNEGFQPLVEALSNIPKPEMSTTMAESSLLDLLPKEEKLRHYF 
YLGSLTTPTCDEKVVWTVFREPIQLHREQILAFSQKLYYDKEQTVSMKDNVRPLQQL 
QRTVIKSGAPGRPLPWALPALLGPMLACLLAGFLR (SEQ ID NO: 26) 
PA21 (phospholipase a2 precursor, NM_000928) 



1 tggtcatctc agttcttttc tcaccttgac tgcaagatga aactccttgt gctagctgtg 
61 ctgctcacag tggccgccgc cgacagcggc atcagccctc gggccgtgtg gcagttccgc 
121 aaaatgatca agtgcgtgat cccggggagt gaccccttct tggaatacaa caactacggc 
181 tgctactgtg gcttgggggg ctcaggcacc cccgtggatg aactggacaa gtgctgccag 
241 acacatgaca actgctatga ccaggccaag aagctggaca gctgtaaatt tctgctggac 
301 aacccgtaca cccacaccta ttcatactcg tgctctggct cggcaatcac ctgtagcagc 
361 aaaaacaaag agtgtgaggc cttcatttgc aactgcgacc gcaacgctgc catctgcttt 
421 tcaaaagctc catataacaa ggcacacaag aacctggaca ccaagaagta ttgtcagagt 
481 tgaatatcac ctctcaaaag catcacctct atctgcctca tctcacactg tactctccaa 
541 taaagcacct tgttgaaaga cctcaaaaaa aaaaaaaaaa aaaaa (SEQ ID NO: 27) 



FIGURE 17A 



PA21 (phospholipase a2 precursor, NM_000928) 



KLLVLAVLLTVAAADSGISPRAVWQFRKMIKCVIPGSDPFLEY 

NYGCYCGLGGSGTPVDELDKCCQTHDNCYDQAKKLDSCKFLLDNPYTHTYSYSCSGS 
ITCSSKNKECEAFICNCDRNAAICFSKAPYNKAHKNLDTKKYCQS (SEQ ID NO: 28) 
PAR2 (proteinase activated receptor 2 precursor, NM_005242) 



1 tgaaacctaa cccgccctgg ggaggcgcgc agcagaggct ccgattcggg gcaggtgaga 
61 ggctgacttt ctctcggtgc gtccagtgga gctctgagtt tcgaatcggc ggcggcggat 
121 tccccgcgcg cccggcgtcg gggcttccag gaggatgcgg agccccagcg cggcgtggct 
181 gctgggggcc gccatcctgc tagcagcctc tctctcctgc agtggcacca tccaaggaac 
241 caatagatcc tctaaaggaa gaagccttat tggtaaggtt gatggcacat cccacgtcac 
301 tggaaaagga gttacagttg aaacagtctt ttctgtggat gagttttctg catctgtcct 
361 cactggaaaa ctgaccactg tcttccttcc aattgtctac acaattgtgt ttgtggtggg 
421 tttgccaagt aacggcatgg ccctgtgggt ctttcttttc cgaactaaga agaagcaccc 
481 tgctgtgatt tacatggcca atctggcctt ggctgacctc ctctctgtca tctggttccc 
541 cttgaagatt gcctatcaca tacatggcaa caactggatt tatggggaag ctctttgtaa 
601 tgtgcttatt ggctttttct atggcaacat gtactgttcc attctcttca tgacctgcct 
661 cagtgtgcag aggtattggg tcatcgtgaa ccccatgggg cactccagga agaaggcaaa 
721 cattgccatt ggcatctccc tggcaatatg gctgctgatt ctgctggtca ccatcccttt 
781 gtatgtcgtg aagcagacca tcttcattcc tgccctgaac atcacgacct gtcatgatgt 
841 tttgcctgag cagctcttgg tgggagacat gttcaattac ttcctctctc tggccattgg 
901 ggtctttctg ttcccagcct tcctcacagc ctctgcctat gtgctgatga tcagaatgct 
961 gcgatcttct gccatggatg aaaactcaga gaagaaaagg aagagggcca tcaaactcat 
1021 tgtcactgtc ctggccatgt acctgatctg cttcactcct agtaaccttc tgcttgtggt 
1081 gcattatttt ctgattaaga gccagggcca gagccatgtc tatgccctgt acattgtagc 
1141 cctctgcctc tctaccctta acagctgcat cgaccccttt gtctattact ttgtttcaca 
1201 tgatttcagg gatcatgcaa agaacgctct cctttgccga agtgtccgca ctgtaaagca 
1261 gatgcaagta tccctcacct caaagaaaca ctccaggaaa tccagctctt actcttcaag 
1321 ttcaaccact gttaagacct cctattgagt tttccaggtc ctcagatggg aattgcacag 
1381 taggatgtgg aacctgttta atgttatgag gacgtgtctg ttatttccta atcaaaaagg 
1441 tctcaccaca taccatgtgg atgcagcacc tctcaggatt gctaggagct cccctgtttg 
1501 catgagaaaa gtagtccccc aaattaacat cagtgtctgt ttcagaatct ctctactcag 
1561 atgaccccag aaactgaacc aacagaagca gacttttcag aagatggtga agacagaaac 
1621 ccagtaactt gcaaaaagta gacttggtgt gaagactcac ttctcagctg aaattatata 
1681 tatacacata tatatatttt acatctggga tcatgataga cttgttaggg cttcaaggcc 
1741 ctcagagatg atcagtccaa ctgaacgacc ttacaaatga ggaaaccaag ataaatgagc 
1801 tgccagaatc aggtttccaa tcaacagcag tgagttggga ttggacagta gaatttcaat 
1861 gtccagtgag tgaggttctt gtaccacttc atcaaaatca tggatcttgg ctgggtgcgg 
1921 tgcctcatgc ctgtaatcct agcactttgg gaggctgagg caggcaatca cttgaggtca 
1981 ggagttcgag accagcctgg ccatcatggc gaaacctcat ctctactaaa aatacaaaag 
2041 ttaaccaggt gtgtggtgca cgtttgtaat cccagttact caggaggctg aggcacaaga 
2101 attgagtatc actttaactc aggaggcaga ggttgcagtg agccgagatt gcaccactgc 
2161 actccagctt gggtgataaa ataaaataaa atagtcgtga atcttgttca aaatgcagat 
2221 tcctcagatt caataatgag agctcagact gggaacaggg cccaggaatc tgtgtggtac 
2281 aaacctgcat ggtgtttatg cacacagaga tttgagaacc attgttctga atgctgcttc 
2341 catttgacaa agtgccgtga taatttttga aaagagaagc aaacaatggt gtctctttta 
2401 tgttcagctt ataatgaaat ctgtttgttg acttattagg actttgaatt atttctttat 
2461 taaccctctg agtttttgta tgtattatta ttaaagaaaa atgcaatcag gattttaaac 
2521 atgtaaatac aaattttgta taacttttga tgacttcagt gaaattttca ggtagtctga 
2581 gtaatagatt gttttgccac ttagaatagc atttgccact tagtatttta aaaaataatt 
2641 gttggagtat ttattgtcag ttttgttcac ttgttatcta atacaaaatt ataaagcctt 
2701 cagagggttt ggaccacatc tctttggaaa atagtttgca acatatttaa gagatacttg 
2761 atgccaaaat gactttatac aacgattgta tttgtgactt ttaaaaataa ttattttatt 
2821 gtgtaattga tttataaata acaaaatttt ttttacaact taaaaaaaaa aaaaaa (SEQ ID NO: 29) 
PAR2 (proteinase activated receptor 2 precursor, NM_005242) 



RSPSAAWLLGAAILLAASLSCSGTIQGTNRSSKGRSLIGKVDG 

SHVTGKGVTVETVFSVDEFSASVLTGKLTTVFLPIVYTIVFVVGLPSNGMALWVFLF 
TKKKHPAVIYMANLALADLLSVIWFPLKIAYHIHGNNWIYGEALCNVLIGFFYGNMY 
SILFMTCLSVQRYWVIVNPMGHSRKKANIAIGISLAIWLLILLVTIPLYVVKQTIFI 
ALNITTCHDVLPEQLLVGDMFNYFLSLAIGVFLFPAFLTASAYVLMIRMLRSSAMDE 
SEKKRKRAIKLIVTVLAMYLICFTPSNLLLVVHYFLIKSQGQSHVYALYIVALCLST 
NSCIDPFVYYFVSHDFRDHAKNALLCRSVRTVKQMQVSLTSKKHSRKSSSYSSSSTT 
KTSY (SEQ ID NO: 30) 
IDE (insulin-degrading enzyme, NM_004969) 

1 ccggctcgaa gcgcaacgag gaagcgtttg cggtgatccc ggcgactgcg ctggctaatg 
61 cggtaccggc tagcgtggct tctgcacccc gcactgccca gcaccttccg ctcagtcctc 
121 ggcgcccgcc tgccgcctcc ggagcgcctg tgtggtttcc aaaaaaagac ttacagcaaa 
181 atgaataatc cagccatcaa gagaatagga aatcacatta ccaagtctcc tgaagacaag 
241 cgagaatatc gagggctaga gctggccaat ggtatcaaag tacttcttat gagtgatccc 
301 accacggata agtcatcagc agcacttgat gtgcacatag gttcattgtc ggatcctcca 
361 aatattgctg gcttaagtca tttttgtgaa catatgcttt ttttgggaac aaagaaatac 
421 cctaaagaaa atgaatacag ccagtttctc agtgagcatg caggaagttc aaatgccttt 
481 actagtggag agcataccaa ttactatttt gatgtttctc atgaacacct agaaggtgcc 
541 ctagacaggt ttgcacagtt ttttctgtgc cccttgttcg atgaaagttg caaagacaga 
601 gaggtgaatg cagttgattc agaacatgag aagaatgtga tgaatgatgc ctggagactc 
661 tttcaattgg aaaaagctac agggaatcct aaacacccct tcagtaaatt tgggacaggt 
721 aacaaatata ctctggagac tagaccaaac caagaaggca ttgatgtaag acaagagcta 
781 ctgaaattcc attctgctta ctattcatcc aacttaatgg ctgtttgtgt tttaggtcga 
841 gaatctttag atgacttgac taatctggtg gtaaagttat tttctgaagt agagaacaaa 
901 aatgttccat tgccagaatt tcctgaacac cctttccaag aagaacatct taaacaactt 
961 tacaaaatag tacccattaa agatattagg aatctctatg tgacatttcc catacctgac 
1021 cttcagaaat actacaaatc aaatcctggt cattatcttg gtcatctcat tgggcatgaa 
1081 ggtcctggaa gtctgttatc agaacttaag tcaaagggct gggttaatac tcttgttggt 
1141 gggcagaagg aaggagcccg aggttttatg ttttttatca ttaatgtgga cttgaccgag 
1201 gaaggattat tacatgttga agatataatt ttgcacatgt ttcaatacat tcagaagtta 
1261 cgtgcagaag gacctcaaga atgggttttc caagagtgca aggacttgaa tgctgttgct 
1321 tttaggttta aagacaaaga gaggccacgg ggctatacat ctaagattgc aggaatattg 
1381 cattattatc ccctagaaga ggtgctcaca gcggaatatt tactggaaga atttagacct 
1441 gacttaatag agatggttct cgataaactc agaccagaaa atgtccgggt tgccatagtt 
1501 tctaaatctt ttgaaggaaa aactgatcgc acagaagagt ggtatggaac ccagtacaaa 
1561 caagaagcta taccggatga agtcatcaag aaatggcaaa atgctgacct gaatgggaaa 
1621 tttaaacttc ctacaaagaa tgaatttatt cctacgaatt ttgagatttt accgttagaa 
1681 aaagaggcga caccataccc tgctcttatt aaggatacag tcatgagcaa actttggttc 
1741 aaacaagatg ataagaaaaa aaagccgaag gcttgtctca actttgaatt tttcagccca 
1801 tttgcttatg tggacccctt gcactgtaac atggcctatt tgtaccttga gctcctcaaa 
1861 gactcactca acgagtatgc atatgcagca gagctagcag gcttgagcta tgatctccaa 
1921 aataccatct atgggatgta tctttcagtg aaaggttaca atgacaagca gccaatttta 
1981 ctaaagaaga ttattgagaa aatggctacc tttgagattg atgaaaaaag atttgaaatt 
2041 atcaaagaag catatatgcg atctcttaac aatttccggg ctgaacagcc tcaccagcat 
2101 gccatgtact acctccgctt gctgatgact gaagtggcct ggactaaaga tgagttaaaa 
2161 gaagctctgg atgatgtaac ccttcctcgc cttaaggcct tcatacctca gctcctgtca 
2221 cggctgcaca ttgaagccct tctccatgga aacataacaa agcaggctgc attaggaatt 
2281 atgcagatgg ttgaagacac cctcattgaa catgctcata ccaaacctct ccttccaagt 
2341 cagctggttc ggtatagaga agttcagctc cctgacagag gatggtttgt ttatcagcag 
2401 agaaatgaag ttcacaataa ctgtggcatc gagatatact accaaacaga catgcaaagc 
2461 acctcagaga atatgtttct ggagctcttc tgtcagatta tctcggaacc ttgcttcaac 
2521 accctgcgca ccaaggagca gttgggctat atcgtcttca gcgggccacg tcgagctaat 
2581 ggcatacaga gcttgagatt catcatccag tcagaaaagc cacctcacta cctagaaagc 
2641 agagtggaag ctttcttaat taccatggaa aagtccatag aggacatgac agaagaggcc 
2701 ttccaaaaac acattcaggc attagcaatt cgtcgactag acaaaccaaa gaagctatct 
2761 gctgagtgtg ctaaatactg gggagaaatc atctcccagc aatataattt tgacagagat 
2821 aacactgagg ttgcatattt aaagacactt accaaggaag atatcatcaa attctacaag 
2881 gaaatgttgg cagtagatgc tccaaggaga cataaggtat ccgtccatgt tcttgccagg 
2941 gaaatggatt cttgtcctgt tgttggagag ttcccatgtc aaaatgacat aaatttgtca 
3001 caagcaccag ccttgccaca acctgaagtg attcagaaca tgaccgaatt caagcgtggt 
3061 ctgccactgt ttccccttgt gaaaccacat attaacttca tggctgcaaa actctgaaga 
3121 ttccccatgc atgggaaagt gcaagtggat gcattcctga gtcttccaga gcctaagaaa 
3181 atcatcttgg ccactttaat agtttctgat tcactattag agaaacaaac aaaaaattgt 
3241 caaatgtcat tatgtagaaa tattataaat ccaaagtaa (SEQ ID NO: 31) 
IDE (insulin-degrading enzyme, NM_004969) 



MRYRLAWLLHPALPSTFRSVLGARLPPPERLCGFQKKTYSKMNN 

PAIKRIGNHITKSPEDKREYRGLELANGIKVLLMSDPTTDKSSAALDVHIGSLSDPPN 
IAGLSHFCEHMLFLGTKKYPKENEYSQFLSEHAGSSNAFTSGEHTNYYFDVSHEHLEG 
ALDRFAQFFLCPLFDESCKDREVNAVDSEHEKNVMNDAWRLFQLEKATGNPKHPFSKF 
GTGNKYTLETRPNQEGIDVRQELLKFHSAYYSSNLMAVCVLGRESLDDLTNLVVKLFS 
EVENKNVPLPEFPEHPFQEEHLKQLYKIVPIKDIRNLYVTFPIPDLQKYYKSNPGHYL 
GHLIGHEGPGSLLSELKSKGWVNTLVGGQKEGARGFMFFIINVDLTEEGLLHVEDIIL 
HMFQYIQKLRAEGPQEWVFQECKDLNAVAFRFKDKERPRGYTSKIAGILHYYPLEEVL 
TAEYLLEEFRPDLIEMVLDKLRPENVRVAIVSKSFEGKTDRTEEWYGTQYKQEAIPDE 
VIKKWQNADLNGKFKLPTKNEFIPTNFEILPLEKEATPYPALIKDTVMSKLWFKQDDK 
KKKPKACLNFEFFSPFAYVDPLHCNMAYLYLELLKDSLNEYAYAAELAGLSYDLQNTI 
YGMYLSVKGYNDKQPILLKKIIEKMATFEIDEKRFEIIKEAYMRSLNNFRAEQPHQHA 
MYYLRLLMTEVAWTKDELKEALDDVTLPRLKAFIPQLLSRLHIEALLHGNITKQAALG 
IMQMVEDTLIEHAHTKPLLPSQLVRYREVQLPDRGWFVYQQRNEVHNNCGIEIYYQTD 
MQSTSENMFLELFCQIISEPCFNTLRTKEQLGYIVFSGPRRANGIQSLRFIIQSEKPP 
HYLESRVEAFLITMEKSIEDMTEEAFQKHIQALAIRRLDKPKKLSAECAKYWGEIISQ 
QYNFDRDNTEVAYLKTLTKEDIIKFYKEMLAVDAPRRHKVSVHVLAREMDSCPVVGEF 
PCQNDINLSQAPALPQPEVIQNMTEFKRGLPLFPLVKPHINFMAAKL (SEQ ID NO: 32) 
MYO1A (myosin-1A, NM_005379) 

1 cagggagcct gggctggaag aggcagcaaa agggaaaatc agaagagtgg acactggcaa 
61 gaggagggca gcctttttcc cagcttcctt gcaccatgga cagctcccat taagccacct 
121 ctccatcctg gggccaggac tcttatgccc cattcctgtc aaattgagat ttcatccacc 
181 attctccaag gacagtgaag ttatacccta gttccagtgt tgggatcagt ggcccctctg 
241 gacatgcctc tcctggaagg ttctgtgggg gtggaggatc ttgtcctcct ggaacccttg 
301 gtggaggagt cactgctcaa gaatcttcag cttcgctatg aaaacaagga gatttatacc 
361 tacattggga atgtggtgat ctcagtgaat ccctatcaac agcttcccat ctatgggcca 
421 gagttcattg ccaaatatca agactatact ttctatgagc tgaagcccca tatctacgca 
481 ttggcaaatg tggcgtacca gtcactgagg gacagggacc gagaccagtg tatcctcatc 
541 acaggcgaga gtggatcagg gaagactgag gccagcaagc tggtgatgtc ttatgtggct 
601 gccgtctgtg ggaaaggaga gcaggtgaac tctgtgaagg agcagctgct acagtctaac 
661 ccagtgctgg aggcttttgg caatgccaag accattcgca acaacaattc ctcccgattt 
721 ggaaaataca tggatattga atttgacttc aagggatccc ccctcggtgg tgtcatcaca 
781 aactatctgc ttgagaaatc ccgattagtg aagcagctca aaggagaaag gaacttccac 
841 atcttctatc agctgctggc tggagcagat gaacagctgc tgaaggccct gaagcttgag 
901 cgggatacaa ctggctatgc ctatctgaat catgaagtat ccagagtgga tggcatggac 
961 gacgcctcca gcttcagggc tgtacagagt gcaatggcag tgattgggtt ctcggaggag 
1021 gagattcgac aagtgctaga ggtgacatcc atggtgctaa agctggggaa cgtgttggtg 
1081 gctgatgagt tccaggccag tgggatacca gcaagtggca tccgtgatgg gagaggtgtt 
1141 cgggagattg gggagatggt gggcttgaat tcagaagaag tagagagagc tttgtgctcg 
1201 aggaccatgg aaacagccaa ggaaaaggtg gtcactgcac tgaatgttat gcaggctcag 
1261 tatgctcggg acgccctggc taagaacatc tacagccgcc tctttgactg gatagtgaat 
1321 cgaatcaatg agagcatcaa ggtgggcatc ggggaaaaga agaaggtaat gggagtcctt 
1381 gatatctacg gttttgagat attagaggat aatagctttg agcaatttgt gatcaactac 
1441 tgcaatgaga agctgcagca ggtgttcata gagatgaccc tgaaagaaga gcaagaggaa 
1501 tataagagag aaggcatacc gtggacaaag gtggactact ttgataatgg catcatttgt 
1561 aagctcattg agcataatca gcgaggtatc ctggccatgt tggatgagga gtgcctgcgg 
1621 cctggggtgg tcagtgactc cactttccta gcaaagctga accagctctt ctccaagcat 
1681 ggccactacg agagcaaagt cacccagaat gcccagcgtc agtatgacca caccatgggc 
1741 ctcagctgct tccgcatctg ccactatgcg ggcaaggtga catacaacgt gaccagcttt 
1801 attgacaaga ataatgacct actcttccga gacctgttgc aggccatgtg gaaggcccag 
1861 caccccctcc ttcggtcctt gtttcctgag ggcaatccta agcaggcatc tctcaaacgc 
1921 cccccgactg ctggggccca gttcaagagt tctgtggcca tcctcatgaa gaatctgtat 
1981 tccaagagcc ccaactacat caggtgcata aagcccaatg agcatcagca gcgaggtcag 
2041 ttctcttcag acctggtggc aacccaggct cggtacctgg gactgctgga gaacgtacgg 
2101 gtgcgacggg caggctatgc ccaccgccag ggttatgggc ccttcctgga aaggtaccga 
2161 ttgctgagcc ggagcacctg gcctcactgg aatgggggag accgggaagg tgttgagaag 
2221 gtcctggggg agctgagcat gtcctcgggg gagctggcct ttggcaagac aaagatcttc 
2281 attagaagcc ccaagactct tttctacctc gaagaacaga ggcgcctgag actccagcag 
2341 ctggccacac tcatacagaa gatttaccga ggctggcgct gccgcaccca ctaccaactg 
2401 atgcgaaaga gtcagatcct catctcctct tggtttcggg gaaacatgca aaagaaatgc 
2461 tatgggaaga taaaggcatc cgtgttattg atccaggctt ttgtgagagg gtggaaggcc 
2521 cgaaagaatt atcgcaaata tttccggtca gaggctgccc tcaccttggc agatttcatc 
2581 tacaagagca tggtacagaa attcctactg gggctgaaga acaatttgcc atccacaaac 
2641 gtcttagaca agacatggcc agccgccccc tacaagtgcc tcagcacagc aaatcaggag 
2701 ctgcagcagc tcttctacca gtggaagtgc aagaggttcc gggatcagct gtccccgaag 
2761 caggtagaga tcctgaggga aaagctctgt gccagtgaac tgttcaaggg caagaaggct 
2821 tcatatcccc agagtgtccc cattccattc tgtggtgact acattgggct gcaagggaac 
2881 cccaagctgc agaagctgaa aggcggggag gaggggcctg ttctgatggc agaggccgtg 
2941 aagaaggtca atcgtggcaa tggcaagact tcttctcgga ttctcctcct gaccaagggc 
3001 catgtgattc tcacagacac caagaagtcc caggccaaaa ttgtcattgg gctagacaat 
3061 gtggctgggg tgtcagtcac cagcctcaag gatgggctct ttagcttgca tctgagtgag 
3121 atgtcatcgg tgggctccaa gggggacttc ctgctggtca gcgagcatgt gattgaactg 
3181 ctgaccaaaa tgtaccgggc tgtgctggat gccacgcaga ggcagcttac agtcaccgtg 
3241 actgagaagt tctcagtgag gttcaaggag aacagtgtgg ctgtcaaggt cgtccagggc 
3301 cctgcaggtg gtgacaacag caagctacgc tacaaaaaaa aggggagtca ttgcttggag 
3361 gtgactgtgc agtgaggagg gggcaccatg cagagatggc agttgcttcc tcctgaacca 
3421 gcactaatcc ccctctgccc tcctgtgtgg gaggatctct aacccctctg atcgtggcgc 
3481 atggcttggg gattaaacta cccttgaaga ggacccttgt cccaaaccct tcttgttctc 
3541 tcctccaaaa gtagcttcct ccaacccgca gcctctctgc acactaataa aacatgtggc 
3601 ttggaaaggt tcaaaaaaaa aaaa (SEQ ID NO: 33) 
MYO1A (myosin-1A, NM_005379) 



PLLEGSVGVEDLVLLEPLVEESLLKNLQLRYENKEIYTYIGNV 

ISVNPYQQLPIYGPEFIAKYQDYTFYELKPHIYALANVAYQSLRDRDRDQCILITGE 
GSGKTEASKLVMSYVAAVCGKGEQVNSVKEQLLQSNPVLEAFGNAKTIRNNNSSRFG 
YMDIEFDFKGSPLGGVITNYLLEKSRLVKQLKGERNFHIFYQLLAGADEQLLKALKL 
RDTTGYAYLNHEVSRVDGMDDASSFRAVQSAMAVIGFSEEEIRQVLEVTSMVLKLGN 
LVADEFQASGIPASGIRDGRGVREIGEMVGLNSEEVERALCSRTMETAKEKVVTALN 
MQAQYARDALAKNIYSRLFDWIVNRINESIKVGIGEKKKVMGVLDIYGFEILEDNSF 
QFVINYCNEKLQQVFIEMTLKEEQEEYKREGIPWTKVDYFDNGIICKLIEHNQRGIL 
MLDEECLRPGVVSDSTFLAKLNQLFSKHGHYESKVTQNAQRQYDHTMGLSCFRICHY 
GKVTYNVTSFIDKNNDLLFRDLLQAMWKAQHPLLRSLFPEGNPKQASLKRPPTAGAQ 
KSSVAILMKNLYSKSPNYIRCIKPNEHQQRGQFSSDLVATQARYLGLLENVRVRRAG 
AHRQGYGPFLERYRLLSRSTWPHWNGGDREGVEKVLGELSMSSGELAFGKTKIFIRS 
KTLFYLEEQRRLRLQQLATLIQKIYRGWRCRTHYQLMRKSQILISSWFRGNMQKKCY 
KIKASVLLIQAFVRGWKARKNYRKYFRSEAALTLADFIYKSMVQKFLLGLKNNLPST 
VLDKTWPAAPYKCLSTANQELQQLFYQWKCKRFRDQLSPKQVEILREKLCASELFKG 
KASYPQSVPIPFCGDYIGLQGNPKLQKLKGGEEGPVLMAEAVKKVNRGNGKTSSRIL 
LTKGHVILTDTKKSQAKIVIGLDNVAGVSVTSLKDGLFSLHLSEMSSVGSKGDFLLV 
EHVIELLTKMYRAVLDATQRQLTVTVTEKFSVRFKENSVAVKVVQGPAGGDNSKLRY 
KKGSHCLEVTVQ (SEQ ID NO: 34) 
CYP2J2 (cytochrome P450 monooxygenase, NM_000775) 

1 gagccatgct cgcggcgatg ggctctctgg cggctgccct ctgggcagtg gtccatcctc 
61 ggactctcct actgggcact gtcgcctttc tgctcgctgc tgactttctc aaaagacggc 
121 gcccaaagaa ctacccgccg gggccctggc gcctgccctt ccttggcaac ttcttccttg 
181 tggacttcga gcagtcgcac ctggaggttc agctgtttgt gaagaaatat gggaaccttt 
241 ttagcttgga gcttggtgac atatctgcag ttcttattac tggcttgccc ttaatcaaag 
301 aagcccttat ccacatggac caaaactttg ggaaccgccc cgtgacccct atgcgagaac 
361 atatctttaa gaaaaatgga ttgattatgt caagtggcca ggcatggaag gagcaaagaa 
421 ggttcactct gacagcacta aggaactttg gtttaggaaa gaagagctta gaggaacgca 
481 ttcaggagga ggcccaacac ctcactgaag caataaaaga ggagaacgga cagccttttg 
541 accctcattt caagatcaac aatgcagttt ccaatatcat ttgctccatc accttcggag 
601 aacgctttga gtaccaggat agttggtttc agcagctgct gaagttacta gatgaagtca 
661 catacttgga ggcttcaaag acatgccagc tctacaatgt ctttccatgg ataatgaaat 
721 tcctgcctgg accccaccaa actctcttca gcaactggaa aaaactgaaa ttgtttgttt 
781 ctcatatgat tgacaaacac agaaaggatt ggaatcctgc agaaacaaga gactttattg 
841 atgcttacct taaagaaatg tcaaagcaca caggcaatcc tacttcaagt ttccatgaag 
901 aaaacctcat ctgcagcacc ctggacctct tctttgccgg aaccgagaca acttccacaa 
961 ctctgcgatg ggctctgctt tatatggccc tctacccaga aatccaagaa aaagtacaag 

1021 ctgagattga cagagtgatt ggccaggggc agcagccgag cacagccgcc cgggagtcca 
1081 tgccctacac caatgctgtc atccatgagg tgcagagaat gggcaacatc atccccctga 
1141 acgttcccag ggaagtgaca gttgatacca ctttggctgg gtaccacctg cccaagggta 
1201 ccatgatcct gaccaatttg acggcgctgc acagggaccc cacagagtgg gccacccctg 
1261 acacattcaa tccggaccat tttctggaga atggacagtt taagaaaagg gaagccttta 
1321 tgcctttctc aataggaaag cgggcatgcc tcggagaaca gttggccagg actgagctgt 
1381 ttattttctt cacttccctt atgcaaaaat ttaccttcag gcccccaaac aatgagaagc 
1441 tgagcctgaa gtttagaatg ggtatcacca tttccccagt cagtcaccgc ctctgcgctg 
1501 ttcctcaggt gtaatattgt taagaaagaa aggggcaagg aaagtaagaa gacatggcac 
1561 gtgttctgaa accactggtg tctgctcaga tgtgttggga caaaatgaaa gtgactttca 
1621 agaaagatca gaggaatttg actcagagaa aactagatcc aaatcccagc tctactgtct 
1681 cgtccgaatt agccttggga aaatcattta tatgctaaat aatttacctt tttatctagg 
1741 agatgaaaag aggataatgt ttccttccat aaagaaagtt cttgtaagaa tcaaaagaaa 
1801 tggtgagctt taagtggttt gtaaaccata aaacacatca taaaagttct atctataaaa 
1861 aaaaaaaaaa aaaaaa (SEQ ID NO: 35) 
CYP2J2 (cytochrome P450 monooxygenase, NM_000775) 



LAAMGSLAAALWAVVHPRTLLLGTVAFLLAADFLKRRRPKNYP 

PGPWRLPFLGNFFLVDFEQSHLEVQLFVKKYGNLFSLELGDISAVLITGLPLIKEALI 
HMDQNFGNRPVTPMREHIFKKNGLIMSSGQAWKEQRRFTLTALRNFGLGKKSLEERIQ 
EEAQHLTEAIKEENGQPFDPHFKINNAVSNIICSITFGERFEYQDSWFQQLLKLLDEV 
TYLEASKTCQLYNVFPWIMKFLPGPHQTLFSNWKKLKLFVSHMIDKHRKDWNPAETRD 

FIDAYLKEMSKHTGNPTSSFHEENLICSTLDLFFAGTETTSTTLRWALLYMALYPEIQ 
EKVQAEIDRVIGQGQQPSTAARESMPYTNAVIHEVQRMGNIIPLNVPREVTVDTTLAG 
YHLPKGTMILTNLTALHRDPTEWATPDTFNPDHFLENGQFKKREAFMPFSIGKRACLG 
EQLARTELFIFFTSLMQKFTFRPPNNEKLSLKFRMGITISPVSHRLCAVPQV (SEQ ID NO: 36) 
PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) 

1 gcccgctgcg gtaaatgggg cagaggccgg gaggggtggg ggttccccgc gccgcagcca 
61 tggagcagct tcgcgccgcc gcccgtctgc agattgttct gggccacctc ggccgcccct 
121 cggccggggc tgtcgtagct catcccactt cagggactat ttcctctgcc agtttccatc 
181 ctcaacaatt ccagtatact ctggataata atgttctaac cctggaacag agaaaatttt 
241 atgaagaaaa tgggtttcta gtaatcaaaa atcttgtacc tgatgccgat attcaacgct 
301 ttcggaatga gtttgaaaaa atctgcagaa aggaggtgaa accattagga ttaacagtaa 
361 tgagagatgt gaccatttcg aaatccgaat atgctccaag tgagaagatg atcacgaagg 
421 tccaggattt ccaggaagat aaggagctct tcagatactg cactctcccc gagattctga 
481 aatatgtgga gtgcttcact ggacctaata ttatggccat gcacacaatg ttgataaaca 
541 aacctccaga ttctggcaag aagacgtccc gtcaccccct gcaccaggac ctgcactatt 
601 tccccttcag gcccagcgat ctcatcgttt gcgcctggac ggcgatggag cacatcagcc 
661 ggaacaacgg ctgtctggtt gtgctcccag gcacacacaa gggctccctg aagccccacg 
721 attaccccaa gtgggagggg ggagttaaca aaatgttcca cgggatccag gactacgagg 
781 aaaacaaggc ccgggtgcac ctggtgatgg agaagggcga cactgttttc ttccatcctt 
841 tgctcatcca cggatctggt cagaataaaa cccagggatt ccggaaggca atttcctgcc 
901 atttcgccag tgccgattgc cactacattg acgtgaaggg caccagtcaa gaaaacatcg 
961 agaaggaagt tgtaggaata gcacataaat tctttggagc tgaaaatagc gtgaacttga 
1021 aggatatttg gatgtttcga gctcgacttg tgaaaggaga aagaaccaat ctttgaaata 
1081 gccatctgct ataactcttt caacagaaaa ccaaaaccaa acgaaatgtc taaggaaaat 
1141 gttttcttaa tgagatgatg taaccttttc tatcacttgt taaaagcaga aaacatgtat 
1201 caggtactta attgcataga gttagttttg cagcacaatg gtgttgcttt aatggaaaaa 
1261 aaaaacagta aaagtgaaat attactgttt taaggaaaac taatttaggg tggcagccaa 
1321 taaaggtggt tggtgtctaa tttaagtgtt aaatcaattt ctttcattca gttagctctt 
1381 tacccaagaa gaagtgaatg atttggagct tagggtatgt tttgtatccc ctttctgata 
1441 aacccattcc ctaccaattt tatgtcataa gagatttttt tcccccaaat ctagaacaat 
1501 gtataataca ttcacatcta gtcaagggca taggaacggt gtcatggagt ccaaataaag 
1561 tggatattcc tgctcgg (SEQ ID NO: 37) 



FIGURE 22A 



PHYH (phytanoyl-CoA-hydroxylase (Refsum disease), NM_006214) 



MEQLRAAARLQIVLGHLGRPSAGAVVAHPTSGTISSASFHPQQF 
QYTLDNNVLTLEQRKFYEENGFLVIKNLVPDADIQRFRNEFEKICRKEVKPLGLTVMR 
DVTISKSEYAPSEKMITKVQDFQEDKELFRYCTLPEILKYVECFTGPNIMAMHTMLIN 
KPPDSGKKTSRHPLHQDLHYFPFRPSDLIVCAWTAMEHISRNNGCLVVLPGTHKGSLK 
PHDYPKWEGGVNKMFHGIQDYEENKARVHLVMEKGDTVFFHPLLIHGSGQNKTQGFRK 
AISCHFASADCHYIDVKGTSQENIEKEVVGIAHKFFGAENSVNLKDIWMFRARLVKGE 
RTNL (SEQ ID NO: 38) 
CYB5 (cytochrome b5, 3' end, NM_001914) 



1 atggcagagc agtcggacga ggccgtgaag tactacaccc tagaggagat tcagaagcac 
61 aaccacagca agagcacctg gctgatcctg caccacaagg tgtacgattt gaccaaattt 
121 ctggaagagc atcctggtgg ggaagaagtt ttaagggaac aagctggagg tgacgctact 
181 gagaactttg aggatgtcgg gcactctaca gatgccaggg aaatgtccaa aacattcatc 
241 attggggagc tccatccaga tgacagacca aagttaaaca agcctccaga accttaaagg 
301 cggtgtttca aggaaactct tatcactact attgattcta gttccagttg gtggaccaac 
361 tgggtgatcc ctgccatctc tgcagtggcc gtcgccttga tgtatcgcct atacatggca 
421 gaggactgaa cacctcctca gaagtcagcg caggaagagc ctgctttgga cacgggagaa 
481 aagaagccat tgctaactac ttcaactgac agaaaccttc acttgaaaac aatgatttta 
541 atatatctct ttctttttct tccgacatta gaaacaaaac aaaaagaact gtcctttctg 
601 cgctcaaatt tttcgagtgt gcctttttat tcatctactt tattttgatg tttccttaat 
661 gtgtaattta cttattataa gcatgatctt ttaaaaatat atttggcttt taaagt (SEQ ID NO: 39) 
CYB5 (cytochrome b5, 3' end, NM_001914) 



MAEQSDEAVKYYTLEEIQKHNHSKSTWLILHHKVYDLTKFLEEH 
PGGEEVLREQAGGDATENFEDVGHSTDAREMSKTFIIGELHPDDRPKLNKPPEP (SEQ ID NO: 40) 



FIGURE 23B 






COXVIb (coxVIb gene, last exon and flanking sequence, NM_001863) 

1 cctcctggga gggagctgaa gccgctcgca agactcccgt agtccccacc tctctcagct 

61 tccggctggt agtagttccg cttcctgtcc gactgtggtg tctttgctga gggtcacatt 

121 gagctgcagg ttgaatccgg ggtgccttta ggattcagca ccatggcgga agacatggag 

181 accaaaatca agaactacaa gaccgcccct tttgacagcc gcttccccaa ccagaaccag 

241 actagaaact gctggcagaa ctacctggac ttccaccgct gtcagaaggc aatgaccgct 

301 aaaggaggcg atatctctgt gtgcgaatgg taccagcgtg tgtaccagtc cctctgcccc 

361 acatcctggg tcacagactg ggatgagcaa cgggctgaag gcacgtttcc cgggaagatc 

421 tgaactggct gcatctccct ttcctctgtc ctccatcctt ctcccaggat ggtgaagggg 

481 gacctggtac ccagtgatcc ccaccccagg atcctaaatc atgacttacc tgctaataaa 

541 aactcattgg aaaagtgaaa aaaaaaaaaa aaaaaaaa (SEQ ID NO: 41) 



FIGURE 24A 



COXVIb (coxVIb gene, last exon and flanking sequence, NM_001863) 



MAEDMETKIKNYKTAPFDSRFPNQNQTRNCWQNYLDFHRCQKAM 

TAKGGDISVCEWYQRVYQSLCPTSWVTDWDEQRAEGTFPGKI (SEQ ID NO: 42) 
TCF4 (NM_030756) 



1 ggtttttttt ttttaccccc cttttttatt tattattttt ttgcacattg agcggatcct 
61 tgggaacgag agaaaaaaga aacccaaact cacgcgtgca gaagatctcc ccccccttcc 
121 cctcccctcc tccctctttt cccctcccca ggagaaaaag acccccaagc agaaaaaagt 
181 tcaccttgga ctcgtctttt tcttgcaata ttttttgggg gggcaaaact ttgagggggt 
241 gatttttttt ggcttttctt cctccttcat ttttcttcca aaattgctgc tggtgggtga 
301 aaaaaaaatg ccgcagctga acggcggtgg aggggatgac ctaggcgcca acgacgaact 
361 gatttccttc aaagacgagg gcgaacagga ggagaagagc tccgaaaact cctcggcaga 
421 gagggattta gctgatgtca aatcgtctct agtcaatgaa tcagaaacga atcaaaacag 
481 ctcctccgat tccgaggcgg aaagacggcc tccgcctcgc tccgaaagtt tccgagacaa 
541 atcccgggaa agtttggaag aagcggccaa gaggcaagat ggagggctct ttaaggggcc 
601 accgtatccc ggctacccct tcatcatgat ccccgacctg acgagcccct acctccccaa 
661 cggatcgctc tcgcccaccg cccgaaccta tctccagatg aaatggccac tgcttgatgt 
721 ccaggcaggg agcctccaga gtagacaagc cctcaaggat gcccggtccc catcaccggc 
781 acacattgtc tctaacaaag tgccagtggt gcagcaccct caccatgtcc accccctcac 
841 gcctcttatc acgtacagca atgaacactt cacgccggga aacccacctc cacacttacc 
901 agccgacgta gaccccaaaa caggaatccc acggcctccg caccctccag atatatcccc 
961 gtattaccca ctatcgcctg gcaccgtagg acaaatcccc catccgctag gatggttagt 

1021 accacagcaa ggtcaaccag tgtacccaat cacgacagga ggattcagac acccctaccc 
1081 cacagctctg accgtcaatg cttccgtgtc caggttccct ccccatatgg tcccaccaca 
1141 tcatacgcta cacacgacgg gcattccgca tccggccata gtcacaccaa cagtcaaaca 
1201 ggaatcgtcc cagagtgatg tcggctcact ccatagttca aagcatcagg actccaaaaa 
1261 ggaagaagaa aagaagaagc cccacataaa gaaacctctt aatgcattca tgttgtatat 
1321 gaaggaaatg agagcaaagg tcgtagctga gtgcacgttg aaagaaagcg cggccatcaa 
1381 ccagatcctt gggcggaggt ggcatgcact gtccagagaa gagcaagcga aatactacga 
1441 gctggcccgg aaggagcgac agcttcatat gcaactgtac cccggctggt ccgcgcggga 
1501 taactatgga aagaagaaga agaggaaaag ggacaagcag ccgggagaga ccaatgaaca 
1561 cagcgaatgt ttcctaaatc cttgcctttc acttcctccg attacagacc tcagcgctcc 
1621 taagaaatgc cgagcgcgct ttggccttga tcaacagaat aactggtgcg gcccttgcag 
1681 gagaaaaaaa aagtgcgttc gctacataca aggtgaaggc agctgcctca gcccaccctc 
1741 ttcagatgga agcttactag attcgcctcc cccctccccg aacctgctag gctcccctcc 
1801 ccgagacgcc aagtcacaga ctgagcagac ccagcctctg tcgctgtccc tgaagcccga 
1861 ccccctggcc cacctgtcca tgatgcctcc gccacccgcc ctcctgctcg ctgaggccac 
1921 ccacaaggcc tccgccctct gtcccaacgg ggccctggac ctgcccccag ccgctttgca 
1981 gcctgccgcc ccctcctcat caattgcaca gccgtcgact tcttggttac attcccacag 
2041 ctccctggcc gggacccagc cccagccgct gtcgctcgtc accaagtctt tagaatagct 
2101 ttagcgtcgt gaaccccgct gctttgttta tggttttgtt tcacttttct taatttgccc 
2161 cccaccccca ccttgaaagg ttttgttttg tactctctta attttgtgcc atgtggctac 
2221 attagttgat gtttatcgag ttcattggtc aatatttgac ccattcttat ttcaatttct 
2281 ccttttaaat atgtagatga gagaagaacc tcatgattgg taccaaaatt tttatcaaca 
2341 gctgtttaaa gtctttgtag cgtttaaaaa atatatatat atacataact gttatgtagt 
2401 tcggatagct tagttttaaa agactgatta aaaaacaaaa aaaa (SEQ ID NO: 43) 
TCF4 (NM_030756) 



MPQLNGGGGDDLGANDELISFKDEGEQEEKSSENSSAERDLADV 

KSSLVNESETNQNSSSDSEAERRPPPRSESFRDKSRESLEEAAKRQDGGLFKGPPYPG 
YPFIMIPDLTSPYLPNGSLSPTARTYLQMKWPLLDVQAGSLQSRQALKDARSPSPAHI 
VSNKVPWQHPHHVHPLTPLITYSNEHFTPGNPPPHLPADVDPKTGIPRPPHPPDISP 
YYPLSPGTVGQIPHPLGWLVPQQGQPVYPITTGGFRHPYPTALTVNASVSRFPPHMVP
PHHTLHTTGIPHPAIVTPTVKQESSQSDVGSLHSSKHQDSKKEEEKKKPHIKKPLNAF 
MLYMKEMRAKVVAECTLKESAAINQILGRRWHALSREEQAKYYELARKERQLHMQLYP 
GWSARDNYGKKKKRKRDKQPGETNEHSECFLNPCLSLPPITDLSAPKKCRARFGLDQQ 
NNWCGPCRRKKKCVRYIQGEGSCLSPPSSDGSLLDSPPPSPNLLGSPPRDAKSQTEQT 
QPLSLSLKPDPLAHLSMMPPPPALLLAEATHKASALCPNGALDLPPAALQPAAPSSSI
AQPSTSWLHSHSSLAGTQPQPLSLVTKSLE (SEQ ID NO: 44)
CAD17 (liver-intestine cadherin, NM_004063)



1 agggagtgtt cccgggggag atactccagt cgtagcaaga gtctcgacca ctgaatggaa 
61 gaaaaggact tttaaccacc attttgtgac ttacagaaag gaatttgaat aaagaaaact 
121 atgatacttc aggcccatct tcactccctg tgtcttctta tgctttattt ggcaactgga 
181 tatggccaag aggggaagtt tagtggaccc ctgaaaccca tgacattttc tatttatgaa 
241 ggccaagaac cgagtcaaat tatattccag tttaaggcca atcctcctgc tgtgactttt 
301 gaactaactg gggagacaga caacatattt gtgatagaac gggagggact tctgtattac
361 aacagagcct tggacaggga aacaagatct actcacaatc tccaggttgc agccctggac
421 gctaatggaa ttatagtgga gggtccagtc cctatcacca tagaagtgaa ggacatcaac 
481 gacaatcgac ccacgtttct ccagtcaaag tacgaaggct cagtaaggca gaactctcgc 
541 ccaggaaagc ccttcttgta tgtcaatgcc acagacctgg atgatccggc cactcccaat 
601 ggccagcttt attaccagat tgtcatccag cttcccatga tcaacaatgt catgtacttt 
661 cagatcaaca acaaaacggg agccatctct cttacccgag agggatctca ggaattgaat 
721 cctgctaaga atccttccta taatctggtg atctcagtga aggacatggg aggccagagt 
781 gagaattcct tcagtgatac cacatctgtg gatatcatag tgacagagaa tatttggaaa 
841 gcaccaaaac ctgtggagat ggtggaaaac tcaactgatc ctcaccccat caaaatcact 
901 caggtgcggt ggaatgatcc cggtgcacaa tattccttag ttgacaaaga gaagctgcca 
961 agattcccat tttcaattga ccaggaagga gatatttacg tgactcagcc cttggaccga 
1021 gaagaaaagg atgcatatgt tttttatgca gttgcaaagg atgagtacgg aaaaccactt 
1081 tcatatccgc tggaaattca tgtaaaagtt aaagatatta atgataatcc acctacatgt 
1141 ccgtcaccag taaccgtatt tgaggtccag gagaatgaac gactgggtaa cagtatcggg 

1201 acccttactg cacatgacag ggatgaagaa aatactgcca acagttttct aaactacagg
1261 attgtggagc aaactcccaa acttcccatg gatggactct tcctaatcca aacctatgct 
1321 ggaatgttac agttagctaa acagtccttg aagaagcaag atactcctca gtacaactta 

1381 acgatagagg tgtctgacaa agatttcaag accctttgtt ttgtgcaaat caacgttatt
1441 gatatcaatg atcagatccc catctttgaa aaatcagatt atggaaacct gactcttgct 
1501 gaagacacaa acattgggtc caccatctta accatccagg ccactgatgc tgatgagcca
1561 tttactggga gttctaaaat tctgtatcat atcataaagg gagacagtga gggacgcctg 
1621 ggggttgaca cagatcccca taccaacacc ggatatgtca taattaaaaa gcctcttgat 
1681 tttgaaacag cagctgtttc caacattgtg ttcaaagcag aaaatcctga gcctctagtg 
1741 tttggtgtga agtacaatgc aagttctttt gccaagttca cgcttattgt gacagatgtg 
1801 aatgaagcac ctcaattttc ccaacacgta ttccaagcga aagtcagtga ggatgtagct 
1861 ataggcacta aagtgggcaa tgtgactgcc aaggatccag aaggtctgga cataagctat
1921 tcactgaggg gagacacaag aggttggctt aaaattgacc acgtgactgg tgagatcttt 
1981 agtgtggctc cattggacag agaagccgga agtccatatc gggtacaagt ggtggccaca 
2041 gaagtagggg ggtcttcctt gagctctgtg tcagagttcc acctgatcct tatggatgtg 
2101 aatgacaacc ctcccaggct agccaaggac tacacgggct tgttcttctg ccatcccctc 
2161 agtgcacctg gaagtctcat tttcgaggct actgatgatg atcagcactt atttcggggt 
2221 ccccatttta cattttccct cggcagtgga agcttacaaa acgactggga agtttccaaa 
2281 atcaatggta ctcatgcccg actgtctacc aggcacacag agtttgagga gagggagtat
2341 gtcgtcttga tccgcatcaa tgatgggggt cggccaccct tggaaggcat tgtttcttta 

2401 ccagttacat tctgcagttg tgtggaagga agttgtttcc ggccagcagg tcaccagact
2461 gggataccca ctgtgggcat ggcagttggt atactgctga ccacccttct ggtgattggt 
2521 ataattttag cagttgtgtt tatccgcata aagaaggata aaggcaaaga taatgttgaa 

2581 agtgctcaag catctgaagt caaacctctg agaagctgaa tttgaaaagg aatgtttgaa
2641 tttatatagc aagtgctatt tcagcaacaa ccatctcatc ctattacttt tcatctaacg 
2701 tgcattataa ttttttaaac agatattccc tcttgtcctt taatatttgc taaatatttc 
2761 ttttttgagg tggagtcttg ctctgtcgcc caggctggag tacagtggtg tgatcccagc 
2821 tcactgcaac ctccgcctcc tgggttcaca tgattctcct gcctcagctt cctaagtagc 
2881 tgggtttaca ggcacccacc accatgccca gctaattttt gtatttttaa tagagacggg 
2941 gtttcgccat ttggccaggc tggtcttgaa ctcctgacgt caagtgatct gcctgccttg
3001 gtctcccaat acaggcatga accactgcac ccacctactt agatatttca tgtgctatag 
3061 acattagaga gatttttcat ttttccatga catttttcct ctctgcaaat ggcttagcta 
3121 cttgtgtttt tcccttttgg ggcaagacag actcattaaa tattctgtac attttttctt
3181 tatcaaggag atatatcagt gttgtctcat agaactgcct ggattccatt tatgtttttt 
3241 ctgattccat cctgtgtccc cttcatcctt gactcctttg gtatttcact gaatttcaaa 
3301 catttgtcag agaagaaaaa cgtgaggact caggaaaaat aaataaataa aagaacagcc
3361 ttttccctta gtattaacag aaatgtttct gtgtcattaa ccatctttaa tcaatgtgac 
3421 atgttgctct ttggctgaaa ttcttcaact tggaaatgac acagacccac agaaggtgtt 
3481 caaacacaac ctactctgca aaccttggta aaggaaccag tcagctggcc agatttcctc 
3541 actacctgcc atgcatacat gctgcgcatg ttttcttcat tcgtatgtta gtaaagtttt 
3601 ggttattata tatttaacat gtggaagaaa acaagacatg aaaagagtgg tgacaaatca
3661 agaataaaca ctggttgtag tcagttttgt ttgttaa (SEQ ID NO: 45)
CAD17 (liver-intestine cadherin, NM_004063)



MILQAHLHSLCLLMLYLATGYGQEGKFSGPLKPMTFSIYEGQEP 

SQIIFQFKANPPAVTFELTGETDNIFVIEREGLLYYNRALDRETRSTHNLQVAALDAN 
GIIVEGPVPITIEVKDINDNRPTFLQSKYEGSVRQNSRPGKPFLYVNATDLDDPATPN 
GQLYYQIVIQLPMINNVMYFQINNKTGAISLTREGSQELNPAKNPSYNLVISVKDMGG
QSENSFSDTTSVDIIVTENIWKAPKPVEMVENSTDPHPIKITQVRWNDPGAQYSLVDK 
EKLPRFPFSIDQEGDIYVTQPLDREEKDAYVFYAVAKDEYGKPLSYPLEIHVKVKDIN 
DNPPTCPSPVTVFEVQENERLGNSIGTLTAHDRDEENTANSFLNYRIVEQTPKLPMDG 
LFLIQTYAGMLQLAKQSLKKQDTPQYNLTIEVSDKDFKTLCFVQINVIDINDQIPIFE 
KSDYGNLTLAEDTNIGSTILTIQATDADEPFTGSSKILYHIIKGDSEGRLGVDTDPHT 
NTGYVIIKKPLDFETAAVSNIVFKAENPEPLVFGVKYNASSFAKFTLIVTDVNEAPQF
SQHVFQAKVSEDVAIGTKVGNVTAKDPEGLDISYSLRGDTRGWLKIDHVTGEIFSVAP 
LDREAGSPYRVQVVATEVGGSSLSSVSEFHLILMDVNDNPPRLAKDYTGLFFCHPLSA
PGSLIFEATDDDQHLFRGPHFTFSLGSGSLQNDWEVSKINGTHARLSTRHTEFEEREY 
VVLIRINDGGRPPLEGIVSLPVTFCSCVEGSCFRPAGHQTGIPTVGMAVGILLTTLLV
IGIILAVVFIRIKKDKGKDNVESAQASEVKPLRS (SEQ ID NO: 46)
CLDN15 (claudin 15, NM_014343)



1 ctcgtcaaca gctgccgcgc gcaggcttag ctcattcctc tgacctgcca ggaagcagag 
61 agacccacag agcaggaggg aggcagaaag tggagacgga cctgagcccg aggaagaggc 
121 aggcagaggc tgaggctgat tccaccccag cctgcctgga caaccctcct tagccgcagc 
181 cccttccagt tccctagggg ttctgcccct ccccctctct ggggcaccag ccccccaggg 
241 tcctgcatcc caccatgtcg atggctgtgg aaacctttgg cttcttcatg gcaactgtgg 
301 ggctgctgat gctgggggtg actctgccaa acagctactg gcgagtgtcc actgtgcacg
361 ggaacgtcat caccaccaac accatcttcg agaacctctg gtttagctgt gccaccgact
421 ccctgggcgt ctacaactgc tgggagttcc cgtccatgct ggccctctct gggtatattc 
481 aggcctgccg ggcactcatg atcaccgcca tcctcctggg cttcctcggc ctcttgctag 
541 gcatagcggg cctgcgctgc accaacattg ggggcctgga gctctccagg aaagccaagc 
601 tggcggccac cgcaggggcc ctccacattc tggccggtat ctgcgggatg gtggccatct 
661 cctggtacgc cttcaacatc acccgggact tcttcgaccc cttgtacccc ggaaccaagt 
721 acgagctggg ccccgccctc tacctggggt ggagcgcctc actgatctcc atcctgggtg 
781 gcctctgcct ctgctccgcc tgctgctgcg gctctgacga ggacccagcc gccagcgccc 
841 ggcggcccta ccaggctccc gtgtccgtga tgcccgtcgc cacctcggac caagaaggcg 
901 acagcagctt tggcaaatac ggcagaaacg cctacgtgta gcagctctgg cccgtgggcc 
961 ccgctgtctt cccactgccc caaggagagg ggacctggcc ggggcccatt cccctatagt 
1021 aacctcaggg gccggccacg ccccgctccc gtagccccgc cccggccacg gccccgtgtc 
1081 ttgcactctc atggcccctc caggccaaga actgctcttg ggaagtcgca tatctcccct 
1141 ctgaggctgg atccctcatc ttctgaccct gggttctggg ctgtgaaggg gacggtgtcc 
1201 ccgcacgttt gtattgtgta taaatacatt cattaataaa tgcatattgt gaccgttc
(SEQ ID NO: 47) 



FIGURE 27A 



CLDN15 (claudin 15, NM_014343) 



MSMAVETFGFFMATVGLLMLGVTLPNSYWRVSTVHGNVITTNTI 

FENLWFSCATDSLGVYNCWEFPSMLALSGYIQACRALMITAILLGFLGLLLGIAGLRC 
TNIGGLELSRKAKLAATAGALHILAGICGMVAISWYAFNITRDFFDPLYPGTKYELGP 
ALYLGWSASLISILGGLCLCSACCCGSDEDPAASARRPYQAPVSVMPVATSDQEGDSS 
FGKYGRNAYV (SEQ ID NO: 48) 
CFTR (chloride channel, NM_000492)

1 aattggaagc aaatgacatc acagcaggtc agagaaaaag ggttgagcgg caggcaccca 
61 gagtagtagg tctttggcat taggagcttg agcccagacg gccctagcag ggaccccagc 
121 gcccgagaga ccatgcagag gtcgcctctg gaaaaggcca gcgttgtctc caaacttttt 
181 ttcagctgga ccagaccaat tttgaggaaa ggatacagac agcgcctgga attgtcagac 
241 atataccaaa tcccttctgt tgattctgct gacaatctat ctgaaaaatt ggaaagagaa 
301 tgggatagag agctggcttc aaagaaaaat cctaaactca ttaatgccct tcggcgatgt
361 tttttctgga gatttatgtt ctatggaatc tttttatatt taggggaagt caccaaagca
421 gtacagcctc tcttactggg aagaatcata gcttcctatg acccggataa caaggaggaa 
481 cgctctatcg cgatttatct aggcataggc ttatgccttc tctttattgt gaggacactg 
541 ctcctacacc cagccatttt tggccttcat cacattggaa tgcagatgag aatagctatg 
601 tttagtttga tttataagaa gactttaaag ctgtcaagcc gtgttctaga taaaataagt 
661 attggacaac ttgttagtct cctttccaac aacctgaaca aatttgatga aggacttgca 
721 ttggcacatt tcgtgtggat cgctcctttg caagtggcac tcctcatggg gctaatctgg 
781 gagttgttac aggcgtctgc cttctgtgga cttggtttcc tgatagtcct tgcccttttt 
841 caggctgggc tagggagaat gatgatgaag tacagagatc agagagctgg gaagatcagt 
901 gaaagacttg tgattacctc agaaatgatt gaaaatatcc aatctgttaa ggcatactgc 
961 tgggaagaag caatggaaaa aatgattgaa aacttaagac aaacagaact gaaactgact 
1021 cggaaggcag cctatgtgag atacttcaat agctcagcct tcttcttctc agggttcttt 
1081 gtggtgtttt tatctgtgct tccctatgca ctaatcaaag gaatcatcct ccggaaaata 
1141 ttcaccacca tctcattctg cattgttctg cgcatggcgg tcactcggca atttccctgg 
1201 gctgtacaaa catggtatga ctctcttgga gcaataaaca aaatacagga tttcttacaa
1261 aagcaagaat ataagacatt ggaatataac ttaacgacta cagaagtagt gatggagaat
1321 gtaacagcct tctgggagga gggatttggg gaattatttg agaaagcaaa acaaaacaat
1381 aacaatagaa aaacttctaa tggtgatgac agcctcttct tcagtaattt ctcacttctt
1441 ggtactcctg tcctgaaaga tattaatttc aagatagaaa gaggacagtt gttggcggtt 
1501 gctggatcca ctggagcagg caagacttca cttctaatga tgattatggg agaactggag 
1561 ccttcagagg gtaaaattaa gcacagtgga agaatttcat tctgttctca gttttcctgg 
1621 attatgcctg gcaccattaa agaaaatatc atctttggtg tttcctatga tgaatataga 
1681 tacagaagcg tcatcaaagc atgccaacta gaagaggaca tctccaagtt tgcagagaaa
1741 gacaatatag ttcttggaga aggtggaatc acactgagtg gaggtcaacg agcaagaatt 
1801 tctttagcaa gagcagtata caaagatgct gatttgtatt tattagactc tccttttgga 
1861 tacctagatg ttttaacaga aaaagaaata tttgaaagct gtgtctgtaa actgatggct 
1921 aacaaaacta ggattttggt cacttctaaa atggaacatt taaagaaagc tgacaaaata
1981 ttaattttga atgaaggtag cagctatttt tatgggacat tttcagaact ccaaaatcta
2041 cagccagact ttagctcaaa actcatggga tgtgattctt tcgaccaatt tagtgcagaa
2101 agaagaaatt caatcctaac tgagacctta caccgtttct cattagaagg agatgctcct 
2161 gtctcctgga cagaaacaaa aaaacaatct tttaaacaga ctggagagtt tggggaaaaa 
2221 aggaagaatt ctattctcaa tccaatcaac tctatacgaa aattttccat tgtgcaaaag 
2281 actcccttac aaatgaatgg catcgaagag gattctgatg agcctttaga gagaaggctg
2341 tccttagtac cagattctga gcagggagag gcgatactgc ctcgcatcag cgtgatcagc 
2401 actggcccca cgcttcaggc acgaaggagg cagtctgtcc tgaacctgat gacacactca
2461 gttaaccaag gtcagaacat tcaccgaaag acaacagcat ccacacgaaa agtgtcactg 
2521 gcccctcagg caaacttgac tgaactggat atatattcaa gaaggttatc tcaagaaact 
2581 ggcttggaaa taagtgaaga aattaacgaa gaagacttaa aggagtgcct ttttgatgat 
2641 atggagagca taccagcagt gactacatgg aacacatacc ttcgatatat tactgtccac 
2701 aagagcttaa tttttgtgct aatttggtgc ttagtaattt ttctggcaga ggtggctgct 
2761 tctttggttg tgctgtggct ccttggaaac actcctcttc aagacaaagg gaatagtact 
2821 catagtagaa ataacagcta tgcagtgatt atcaccagca ccagttcgta ttatgtgttt
2881 tacatttacg tgggagtagc cgacactttg cttgctatgg gattcttcag aggtctacca 
2941 ctggtgcata ctctaatcac agtgtcgaaa attttacacc acaaaatgtt acattctgtt 
3001 cttcaagcac ctatgtcaac cctcaacacg ttgaaagcag gtgggattct taatagattc 
3061 tccaaagata tagcaatttt ggatgacctt ctgcctctta ccatatttga cttcatccag 
3121 ttgttattaa ttgtgattgg agctatagca gttgtcgcag ttttacaacc ctacatcttt
3181 gttgcaacag tgccagtgat agtggctttt attatgttga gagcatattt cctccaaacc 
3241 tcacagcaac tcaaacaact ggaatctgaa ggcaggagtc caattttcac tcatcttgtt 
3301 acaagcttaa aaggactatg gacacttcgt gccttcggac ggcagcctta ctttgaaact
3361 ctgttccaca aagctctgaa tttacatact gccaactggt tcttgtacct gtcaacactg 
3421 cgctggttcc aaatgagaat agaaatgatt tttgtcatct tcttcattgc tgttaccttc 
3481 atttccattt taacaacagg agaaggagaa ggaagagttg gtattatcct gactttagcc 
3541 atgaatatca tgagtacatt gcagtgggct gtaaactcca gcatagatgt ggatagcttg 
3601 atgcgatctg tgagccgagt ctttaagttc attgacatgc caacagaagg taaacctacc 
3661 aagtcaacca aaccatacaa gaatggccaa ctctcgaaag ttatgattat tgagaattca
3721 cacgtgaaga aagatgacat ctggccctca gggggccaaa tgactgtcaa agatctcaca 
3781 gcaaaataca cagaaggtgg aaatgccata ttagagaaca tttccttctc aataagtcct
3841 ggccagaggg tgggcctctt gggaagaact ggatcaggga agagtacttt gttatcagct
3901 tttttgagac tactgaacac tgaaggagaa atccagatcg atggtgtgtc ttgggattca

3961 ataactttgc aacagtggag gaaagccttt ggagtgatac cacagaaagt atttattttt
4021 tctggaacat ttagaaaaaa cttggatccc tatgaacagt ggagtgatca agaaatatgg
4081 aaagttgcag atgaggttgg gctcagatct gtgatagaac agtttcctgg gaagcttgac
4141 tttgtccttg tggatggggg ctgtgtccta agccatggcc acaagcagtt gatgtgcttg 

4201 gctagatctg ttctcagtaa ggcgaagatc ttgctgcttg atgaacccag tgctcatttg
4261 gatccagtaa cataccaaat aattagaaga actctaaaac aagcatttgc tgattgcaca 
4321 gtaattctct gtgaacacag gatagaagca atgctggaat gccaacaatt tttggtcata 

4381 gaagagaaca aagtgcggca gtacgattcc atccagaaac tgctgaacga gaggagcctc
4441 ttccggcaag ccatcagccc ctccgacagg gtgaagctct ttccccaccg gaactcaagc 
4501 aagtgcaagt ctaagcccca gattgctgct ctgaaagagg agacagaaga agaggtgcaa 
4561 gatacaaggc tttagagagc agcataaatg ttgacatggg acatttgctc atggaattgg 
4621 agctcgtggg acagtcacct catggaattg gagctcgtgg aacagttacc tctgcctcag 
4681 aaaacaagga tgaattaagt ttttttttaa aaaagaaaca tttggtaagg ggaattgagg
4741 acactgatat gggtcttgat aaatggcttc ctggcaatag tcaaattgtg tgaaaggtac 
4801 ttcaaatcct tgaagattta ccacttgtgt tttgcaagcc agattttcct gaaaaccctt 

4861 gccatgtgct agtaattgga aaggcagctc taaatgtcaa tcagcctagt tgatcagctt
4921 attgtctagt gaaactcgtt aatttgtagt gttggagaag aactgaaatc atacttctta 

4981 gggttatgat taagtaatga taactggaaa cttcagcggt ttatataagc ttgtattcct
5041 ttttctctcc tctccccatg atgtttagaa acacaactat attgtttgct aagcattcca 
5101 actatctcat ttccaagcaa gtattagaat accacaggaa ccacaagact gcacatcaaa 
5161 atatgcccca ttcaacatct agtgagcagt caggaaagag aacttccaga tcctggaaat 
5221 cagggttagt attgtccagg tctaccaaaa atctcaatat ttcagataat cacaatacat 
5281 cccttacctg ggaaagggct gttataatct ttcacagggg acaggatggt tcccttgatg
5341 aagaagttga tatgcctttt cccaactcca gaaagtgaca agctcacaga cctttgaact 
5401 agagtttagc tggaaaagta tgttagtgca aattgtcaca ggacagccct tctttccaca
5461 gaagctccag gtagagggtg tgtaagtaga taggccatgg gcactgtggg tagacacaca
5521 tgaagtccaa gcatttagat gtataggttg atggtggtat gttttcaggc tagatgtatg 

5581 tacttcatgc tgtctacact aagagagaat gagagacaca ctgaagaagc accaatcatg
5641 aattagtttt atatgcttct gttttataat tttgtgaagc aaaatttttt ctctaggaaa 
5701 tatttatttt aataatgttt caaacatata ttacaatgct gtattttaaa agaatgatta 
5761 tgaattacat ttgtataaaa taatttttat atttgaaata ttgacttttt atggcactag 
5821 tatttttatg aaatattatg ttaaaactgg gacaggggag aacctagggt gatattaacc 
5881 aggggccatg aatcaccttt tggtctggag ggaagccttg gggctgatcg agttgttgcc
5941 cacagctgta tgattcccag ccagacacag cctcttagat gcagttctga agaagatggt 
6001 accaccagtc tgactgtttc catcaagggt acactgcctt ctcaactcca aactgactct 
6061 taagaagact gcattatatt tattactgta agaaaatatc acttgtcaat aaaatccata
6121 catttgtgt (SEQ ID NO: 49)
CFTR (chloride channel, NM_000492)



MQRSPLEKASVVSKLFFSWTRPILRKGYRQRLELSDIYQIPSVD

SADNLSEKLEREWDRELASKKNPKLINALRRCFFWRFMFYGIFLYLGEVTKAVQPLLL
GRIIASYDPDNKEERSIAIYLGIGLCLLFIVRTLLLHPAIFGLHHIGMQMRIAMFSLI
YKKTLKLSSRVLDKISIGQLVSLLSNNLNKFDEGLALAHFVWIAPLQVALLMGLIWEL
LQASAFCGLGFLIVLALFQAGLGRMMMKYRDQRAGKISERLVITSEMIENIQSVKAYC
WEEAMEKMIENLRQTELKLTRKAAYVRYFNSSAFFFSGFFVVFLSVLPYALIKGIILR
KIFTTISFCIVLRMAVTRQFPWAVQTWYDSLGAINKIQDFLQKQEYKTLEYNLTTTEV 
MENVTAFWEEGFGELFEKAKQNNNNRKTSNGDDSLFFSNFSLLGTPVLKDINFKIER

QLLAVAGSTGAGKTSLLMMIMGELEPSEGKIKHSGRISFCSQFSWIMPGTIKENIIF 
VSYDEYRYRSVIKACQLEEDISKFAEKDNIVLGEGGITLSGGQRARISLARAVYKDA
LYLLDSPFGYLDVLTEKEIFESCVCKLMANKTRILVTSKMEHLKKADKILILNEGSS
FYGTFSELQNLQPDFSSKLMGCDSFDQFSAERRNSILTETLHRFSLEGDAPVSWTET
KQSFKQTGEFGEKRKNSILNPINSIRKFSIVQKTPLQMNGIEEDSDEPLERRLSLVP 
SEQGEAILPRISVISTGPTLQARRRQSVLNLMTHSVNQGQNIHRKTTASTRKVSLAP 
ANLTELDIYSRRLSQETGLEISEEINEEDLKECLFDDMESIPAVTTWNTYLRYITVH
SLIFVLIWCLVIFLAEVAASLVVLWLLGNTPLQDKGNSTHSRNNSYAVIITSTSSYY 
FYIYVGVADTLLAMGFFRGLPLVHTLITVSKILHHKMLHSVLQAPMSTLNTLKAGGI 
NRFSKDIAILDDLLPLTIFDFIQLLLIVIGAIAVVAVLQPYIFVATVPVIVAFIMLR
YFLQTSQQLKQLESEGRSPIFTHLVTSLKGLWTLRAFGRQPYFETLFHKALNLHTAN 
FLYLSTLRWFQMRIEMIFVIFFIAVTFISILTTGEGEGRVGIILTLAMNIMSTLQWA 
NSSIDVDSLMRSVSRVFKFIDMPTEGKPTKSTKPYKNGQLSKVMIIENSHVKKDDIW 
SGGQMTVKDLTAKYTEGGNAILENISFSISPGQRVGLLGRTGSGKSTLLSAFLRLLN 
EGEIQIDGVSWDSITLQQWRKAFGVIPQKVFIFSGTFRKNLDPYEQWSDQEIWKVAD
VGLRSVIEQFPGKLDFVLVDGGCVLSHGHKQLMCLARSVLSKAKILLLDEPSAHLDP 
TYQIIRRTLKQAFADCTVILCEHRIEAMLECQQFLVIEENKVRQYDSIQKLLNERSL
RQAISPSDRVKLFPHRNSSKCKSKPQIAALKEETEEEVQDTRL (SEQ ID NO: 50) 
H2R (histamine H2 receptor, NM_022304)



1 ctcctgccct ccactgactc cagagaggga gatccccagt acttgactcc atcacgcaga 
61 tgggagcagg caccagctat ggagagggat acagctgcgt ctccacatga cccatcctgc 
121 atgacaccaa agccaccgcc agacagtgcc tcggattcta tgcaaaacct gggaagcgga 
181 gacctacccc agccccggga ggaagctagc tcttcagggg accgtctgag gactggagtt 
241 tgatccatga acctggcttc gaggccttgc ttttctctct tcttcattca tattcattcc 
301 caacacctta gaaggtgttg cttaatttat ttctagaaaa gcagcccaga gtcagtcatt
361 gaagccttcc ccaccccctg gccaaaaaaa aaaaaaaaaa aaaactggac acattttgga
421 tctgttggga gcttggagtc cagtggttgg catagttgtc acattgggag cagagaagaa 
481 gcaaccaggg gccctgatca ggggactgag ccgtagagtc ccaggatggc acccaatggc 
541 acagcctctt ccttttgcct ggactctacc gcatgcaaga tcaccatcac cgtggtcctt 
601 gcggtcctca tcctcatcac cgttgctggc aatgtggtcg tctgtctggc cgtgggcttg 
661 aaccgccggc tccgcaacct gaccaattgt ttcatcgtgt ccttggctat cactgacctg 
721 ctcctcggcc tcctggtgct gcccttctct gccatctacc agctgtcctg caagtggagc 
781 tttggcaagg tcttctgcaa tatctacacc agcctggatg tgatgctctg cacagcctcc 
841 attcttaacc tcttcatgat cagcctcgac cggtactgcg ctgtcatgga cccactgcgg 
901 taccctgtgc tggtcacccc agttcgggtc gccatctctc tggtcttaat ttgggtcatc
961 tccattaccc tgtcctttct gtctatccac ctggggtgga acagcaggaa cgagaccagc 
1021 aagggcaatc ataccacctc taagtgcaaa gtccaggtca atgaagtgta cgggctggtg 
1081 gatgggctgg tcaccttcta cctcccgcta ctgatcatgt gcatcaccta ctaccgcatc 
1141 ttcaaggtcg cccgggatca ggccaagagg atcaatcaca ttagctcctg gaaggcagcc 
1201 accatcaggg agcacaaagc cacagtgaca ctggccgccg tcatgggggc cttcatcatc
1261 tgctggtttc cctacttcac cgcgtttgtg taccgtgggc tgagagggga tgatgccatc
1321 aatgaggtgt tagaagccat cgttctgtgg ctgggctatg ccaactcagc cctgaacccc 

1381 atcctgtatg ctgcgctgaa cagagacttc cgcaccgggt accaacagct cttctgctgc
1441 aggctggcca accgcaactc ccacaaaact tctctgaggt ccaacgcctc tcagctgtcc 
1501 aggacccaaa gccgagaacc caggcaacag gaagagaaac ccctgaagct ccaggtgtgg 
1561 agtgggacag aagtcacggc cccccaggga gccacagaca ggtaatagcc ctagccattg 
1621 gtgcacagga tgggggcaat gggaggggat gctactgatg ggaatgatta agggagctgc 
1681 tgtttaggtg gtgctggttt atgttctagg aactcttcat gagcactttg taaacaccct
1741 cttgcttaat cctcccaacg gcccccaaag gtagaactta gctccctttt aaaaggagca 
1801 cattaaaatt ctcagaggac ttggcaaggg ccgcacagct ggggcat (SEQ ID NO: 51) 



FIGURE 29A



H2R (histamine H2 receptor, NM_022304)



APNGTASSFCLDSTACKITITVVLAVLILITVAGNVVVCLAVG

NRRLRNLTNCFIVSLAITDLLLGLLVLPFSAIYQLSCKWSFGKVFCNIYTSLDVMLC 
ASILNLFMISLDRYCAVMDPLRYPVLVTPVRVAISLVLIWVISITLSFLSIHLGWNS
NETSKGNHTTSKCKVQVNEVYGLVDGLVTFYLPLLIMCITYYRIFKVARDQAKRINH 
SSWKAATIREHKATVTLAAVMGAFIICWFPYFTAFVYRGLRGDDAINEVLEAIVLWL
YANSALNPILYAALNRDFRTGYQQLFCCRLANRNSHKTSLRSNASQLSRTQSREPRQ 
EEKPLKLQVWSGTEVTAPQGATDR (SEQ ID NO: 52) 
EGFR (NM_005228)

1 gagctagccc cggcggccgc cgccgcccag accggacgac aggccacctc gtcggcgtcc 
61 gcccgagtcc ccgcctcgcc gccaacgcca caaccaccgc gcacggcccc ctgactccgt 
121 ccagtattga tcgggagagc cggagcgagc tcttcgggga gcagcgatgc gaccctccgg 
181 gacggccggg gcagcgctcc tggcgctgct ggctgcgctc tgcccggcga gtcgggctct 
241 ggaggaaaag aaagtttgcc aaggcacgag taacaagctc acgcagttgg gcacttttga 
301 agatcatttt ctcagcctcc agaggatgtt caataactgt gaggtggtcc ttgggaattt
361 ggaaattacc tatgtgcaga ggaattatga tctttccttc ttaaagacca tccaggaggt
421 ggctggttat gtcctcattg ccctcaacac agtggagcga attcctttgg aaaacctgca 
481 gatcatcaga ggaaatatgt actacgaaaa ttcctatgcc ttagcagtct tatctaacta 
541 tgatgcaaat aaaaccggac tgaaggagct gcccatgaga aatttacagg aaatcctgca 

601 tggcgccgtg cggttcagca acaaccctgc cctgtgcaac gtggagagca tccagtggcg
661 ggacatagtc agcagtgact ttctcagcaa catgtcgatg gacttccaga accacctggg 
721 cagctgccaa aagtgtgatc caagctgtcc caatgggagc tgctggggtg caggagagga 

781 gaactgccag aaactgacca aaatcatctg tgcccagcag tgctccgggc gctgccgtgg
841 caagtccccc agtgactgct gccacaacca gtgtgctgca ggctgcacag gcccccggga 
901 gagcgactgc ctggtctgcc gcaaattccg agacgaagcc acgtgcaagg acacctgccc 
961 cccactcatg ctctacaacc ccaccacgta ccagatggat gtgaaccccg agggcaaata 

1021 cagctttggt gccacctgcg tgaagaagtg tccccgtaat tatgtggtga cagatcacgg 
1081 ctcgtgcgtc cgagcctgtg gggccgacag ctatgagatg gaggaagacg gcgtccgcaa 
1141 gtgtaagaag tgcgaagggc cttgccgcaa agtgtgtaac ggaataggta ttggtgaatt 
1201 taaagactca ctctccataa atgctacgaa tattaaacac ttcaaaaact gcacctccat
1261 cagtggcgat ctccacatcc tgccggtggc atttaggggt gactccttca cacatactcc
1321 tcctctggat ccacaggaac tggatattct gaaaaccgta aaggaaatca cagggttttt 

1381 gctgattcag gcttggcctg aaaacaggac ggacctccat gcctttgaga acctagaaat
1441 catacgcggc aggaccaagc aacatggtca gttttctctt gcagtcgtca gcctgaacat 
1501 aacatccttg ggattacgct ccctcaagga gataagtgat ggagatgtga taatttcagg
1561 aaacaaaaat ttgtgctatg caaatacaat aaactggaaa aaactgtttg ggacctccgg 
1621 tcagaaaacc aaaattataa gcaacagagg tgaaaacagc tgcaaggcca caggccaggt 
1681 ctgccatgcc ttgtgctccc ccgagggctg ctggggcccg gagcccaggg actgcgtctc 
1741 ttgccggaat gtcagccgag gcagggaatg cgtggacaag tgcaaccttc tggagggtga 
1801 gccaagggag tttgtggaga actctgagtg catacagtgc cacccagagt gcctgcctca 
1861 ggccatgaac atcacctgca caggacgggg accagacaac tgtatccagt gtgcccacta 
1921 cattgacggc ccccactgcg tcaagacctg cccggcagga gtcatgggag aaaacaacac 
1981 cctggtctgg aagtacgcag acgccggcca tgtgtgccac ctgtgccatc caaactgcac 
2041 ctacggatgc actgggccag gtcttgaagg ctgtccaacg aatgggccta agatcccgtc
2101 catcgccact gggatggtgg gggccctcct cttgctgctg gtggtggccc tggggatcgg 
2161 cctcttcatg cgaaggcgcc acatcgttcg gaagcgcacg ctgcggaggc tgctgcagga 
2221 gagggagctt gtggagcctc ttacacccag tggagaagct cccaaccaag ctctcttgag 
2281 gatcttgaag gaaactgaat tcaaaaagat caaagtgctg ggctccggtg cgttcggcac
2341 ggtgtataag ggactctgga tcccagaagg tgagaaagtt aaaattcccg tcgctatcaa 
2401 ggaattaaga gaagcaacat ctccgaaagc caacaaggaa atcctcgatg aagcctacgt 
2461 gatggccagc gtggacaacc cccacgtgtg ccgcctgctg ggcatctgcc tcacctccac 
2521 cgtgcagctc atcacgcagc tcatgccctt cggctgcctc ctggactatg tccgggaaca 
2581 caaagacaat attggctccc agtacctgct caactggtgt gtgcagatcg caaagggcat 
2641 gaactacttg gaggaccgtc gcttggtgca ccgcgacctg gcagccagga acgtactggt 
2701 gaaaacaccg cagcatgtca agatcacaga ttttgggctg gccaaactgc tgggtgcgga
2761 agagaaagaa taccatgcag aaggaggcaa agtgcctatc aagtggatgg cattggaatc
2821 aattttacac agaatctata cccaccagag tgatgtctgg agctacgggg tgaccgtttg 
2881 ggagttgatg acctttggat ccaagccata tgacggaatc cctgccagcg agatctcctc 
2941 catcctggag aaaggagaac gcctccctca gccacccata tgtaccatcg atgtctacat 

3001 gatcatggtc aagtgctgga tgatagacgc agatagtcgc ccaaagttcc gtgagttgat
3061 catcgaattc tccaaaatgg cccgagaccc ccagcgctac cttgtcattc agggggatga 
3121 aagaatgcat ttgccaagtc ctacagactc caacttctac cgtgccctga tggatgaaga
3181 agacatggac gacgtggtgg atgccgacga gtacctcatc ccacagcagg gcttcttcag 
3241 cagcccctcc acgtcacgga ctcccctcct gagctctctg agtgcaacca gcaacaattc 
3301 caccgtggct tgcattgata gaaatgggct gcaaagctgt cccatcaagg aagacagctt
3361 cttgcagcga tacagctcag accccacagg cgccttgact gaggacagca tagacgacac 
3421 cttcctccca gtgcctgaat acataaacca gtccgttccc aaaaggcccg ctggctctgt 
3481 gcagaatcct gtctatcaca atcagcctct gaaccccgcg cccagcagag acccacacta 
3541 ccaggacccc cacagcactg cagtgggcaa ccccgagtat ctcaacactg tccagcccac 
3601 ctgtgtcaac agcacattcg acagccctgc ccactgggcc cagaaaggca gccaccaaat 
3661 tagcctggac aaccctgact accagcagga cttctttccc aaggaagcca agccaaatgg 
3721 catctttaag ggctccacag ctgaaaatgc agaataccta agggtcgcgc cacaaagcag 
3781 tgaatttatt ggagcatgac cacggaggat agtatgagcc ctaaaaatcc agactctttc
3841 gatacccagg accaagccac agcaggtcct ccatcccaac agccatgccc gcattagctc
3901 ttagacccac agactggttt tgcaacgttt acaccgacta gccaggaagt acttccacct
3961 cgggcacatt ttgggaagtt gcattccttt gtcttcaaac tgtgaagcat ttacagaaac
4021 gcatccagca agaatattgt ccctttgagc agaaatttat ctttcaaaga ggtatatttg 
4081 aaaaaaaaaa aaaaagtata tgtgaggatt tttattgatt ggggatcttg gagtttttca 
4141 ttgtcgctat tgatttttac ttcaatgggc tcttccaaca aggaagaagc ttgctggtag 
4201 cacttgctac cctgagttca tccaggccca actgtgagca aggagcacaa gccacaagtc 

4261 ttccagagga tgcttgattc cagtggttct gcttcaaggc ttccactgca aaacactaaa
4321 gatccaagaa ggccttcatg gccccagcag gccggatcgg tactgtatca agtcatggca 

4381 ggtacagtag gataagccac tctgtccctt cctgggcaaa gaagaaacgg aggggatgaa
4441 ttcttcctta gacttacttt tgtaaaaatg tccccacggt acttactccc cactgatgga 
4501 ccagtggttt ccagtcatga gcgttagact gacttgtttg tcttccattc cattgttttg
4561 aaactcagta tgccgcccct gtcttgctgt catgaaatca gcaagagagg atgacacatc 
4621 aaataataac tcggattcca gcccacattg gattcatcag catttggacc aatagcccac 
4681 agctgagaat gtggaatacc taaggataac accgcttttg ttctcgcaaa aacgtatctc 
4741 ctaatttgag gctcagatga aatgcatcag gtcctttggg gcatagatca gaagactaca 
4801 aaaatgaagc tgctctgaaa tctcctttag ccatcacccc aaccccccaa aattagtttg 
4861 tgttacttat ggaagatagt tttctccttt tacttcactt caaaagcttt ttactcaaag 
4921 agtatatgtt ccctccaggt cagctgcccc caaaccccct ccttacgctt tgtcacacaa 
4981 aaagtgtctc tgccttgagt catctattca agcacttaca gctctggcca caacagggca 
5041 ttttacaggt gcgaatgaca gtagcattat gagtagtgtg aattcaggta gtaaatatga
5101 aactagggtt tgaaattgat aatgctttca caacatttgc agatgtttta gaaggaaaaa 
5161 agttccttcc taaaataatt tctctacaat tggaagattg gaagattcag ctagttagga 
5221 gcccattttt tcctaatctg tgtgtgccct gtaacctgac tggttaacag cagtcctttg 
5281 taaacagtgt tttaaactct cctagtcaat atccacccca tccaatttat caaggaagaa
5341 atggttcaga aaatattttc agcctacagt tatgttcagt cacacacaca tacaaaatgt 
5401 tccttttgct tttaaagtaa tttttgactc ccagatcagt cagagcccct acagcattgt 
5461 taagaaagta tttgattttt gtctcaatga aaataaaact atattcattt cc (SEQ ID NO: 53)
EGFR (NM_005228)



RPSGTAGAALLALLAALCPASRALEEKKVCQGTSNKLTQLGTF 

DHFLSLQRMFNNCEVVLGNLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVERIPLE
LQIIRGNMYYENSYALAVLSNYDANKTGLKELPMRNLQEILHGAVRFSNNPALCNVE
IQWRDIVSSDFLSNMSMDFQNHLGSCQKCDPSCPNGSCWGAGEENCQKLTKIICAQQ
SGRCRGKSPSDCCHNQCAAGCTGPRESDCLVCRKFRDEATCKDTCPPLMLYNPTTYQ 
DVNPEGKYSFGATCVKKCPRNYVVTDHGSCVRACGADSYEMEEDGVRKCKKCEGPCR
VCNGIGIGEFKDSLSINATNIKHFKNCTSISGDLHILPVAFRGDSFTHTPPLDPQEL 
ILKTVKEITGFLLIQAWPENRTDLHAFENLEIIRGRTKQHGQFSLAVVSLNITSLGL
SLKEISDGDVIISGNKNLCYANTINWKKLFGTSGQKTKIISNRGENSCKATGQVCHA
CSPEGCWGPEPRDCVSCRNVSRGRECVDKCNLLEGEPREFVENSECIQCHPECLPQA 
NITCTGRGPDNCIQCAHYIDGPHCVKTCPAGVMGENNTLVWKYADAGHVCHLCHPNC 
YGCTGPGLEGCPTNGPKIPSIATGMVGALLLLLVVALGIGLFMRRRHIVRKRTLRRL
QERELVEPLTPSGEAPNQALLRILKETEFKKIKVLGSGAFGTVYKGLWIPEGEKVKI 
VAIKELREATSPKANKEILDEAYVMASVDNPHVCRLLGICLTSTVQLITQLMPFGCL 
DYVREHKDNIGSQYLLNWCVQIAKGMNYLEDRRLVHRDLAARNVLVKTPQHVKITDF
LAKLLGAEEKEYHAEGGKVPIKWMALESILHRIYTHQSDVWSYGVTVWELMTFGSKP 
DGIPASEISSILEKGERLPQPPICTIDVYMIMVKCWMIDADSRPKFRELIIEFSKMA 
DPQRYLVIQGDERMHLPSPTDSNFYRALMDEEDMDDVVDADEYLIPQQGFFSSPSTS
TPLLSSLSATSNNSTVACIDRNGLQSCPIKEDSFLQRYSSDPTGALTEDSIDDTFLP 
PEYINQSVPKRPAGSVQNPVYHNQPLNPAPSRDPHYQDPHSTAVGNPEYLNTVQPTC 
NSTFDSPAHWAQKGSHQISLDNPDYQQDFFPKEAKPNGIFKGSTAENAEYLRVAPQS 
EFIGA (SEQ ID NO: 54) 
EPHB2 (NM_004442)

1 gccccgggaa gcgcagccat ggctctgcgg aggctggggg ccgcgctgct gctgctgccg 
61 ctgctcgccg ccgtggaaga aacgctaatg gactccacta cagcgactgc tgagctgggc 
121 tggatggtgc atcctccatc agggtgggaa gaggtgagtg gctacgatga gaacatgaac 
181 acgatccgca cgtaccaggt gtgcaacgtg tttgagtcaa gccagaacaa ctggctacgg 
241 accaagttta tccggcgccg tggcgcccac cgcatccacg tggagatgaa gttttcggtg 
301 cgtgactgca gcagcatccc cagcgtgcct ggctcctgca aggagacctt caacctctat
361 tactatgagg ctgactttga ctcggccacc aagaccttcc ccaactggat ggagaatcca
421 tgggtgaagg tggataccat tgcagccgac gagagcttct cccaggtgga cctgggtggc 
481 cgcgtcatga aaatcaacac cgaggtgcgg agcttcggac ctgtgtcccg cagcggcttc 
541 tacctggcct tccaggacta tggcggctgc atgtccctca tcgccgtgcg tgtcttctac 
601 cgcaagtgcc cccgcatcat ccagaatggc gccatcttcc aggaaaccct gtcgggggct 
661 gagagcacat cgctggtggc tgcccggggc agctgcatcg ccaatgcgga agaggtggat 
721 gtacccatca agctctactg taacggggac ggcgagtggc tggtgcccat cgggcgctgc 
781 atgtgcaaag caggcttcga ggccgttgag aatggcaccg tctgccgagg ttgtccatct 
841 gggactttca aggccaacca aggggatgag gcctgtaccc actgtcccat caacagccgg 
901 accacttctg aaggggccac caactgtgtc tgccgcaatg gctactacag agcagacctg 
961 gaccccctgg acatgccctg cacaaccatc ccctccgcgc cccaggctgt gatttccagt 
1021 gtcaatgaga cctccctcat gctggagtgg acccctcccc gcgactccgg aggccgagag 
1081 gacctcgtct acaacatcat ctgcaagagc tgtggctcgg gccggggtgc ctgcacccgc 
1141 tgcggggaca atgtacagta cgcaccacgc cagctaggcc tgaccgagcc acgcatttac 
1201 atcagtgacc tgctggccca cacccagtac accttcgaga tccaggctgt gaacggcgtt
1261 actgaccaga gccccttctc gcctcagttc gcctctgtga acatcaccac caaccaggca
1321 gctccatcgg cagtgtccat catgcatcag gtgagccgca ccgtggacag cattaccctg 

1381 tcgtggtccc agccggacca gcccaatggc gtgatcctgg actatgagct gcagtactat
1441 gagaaggagc tcagtgagta caacgccaca gccataaaaa gccccaccaa cacggtcacc 
1501 gtgcagggcc tcaaagccgg cgccatctat gtcttccagg tgcgggcacg caccgtggca 
1561 ggctacgggc gctacagcgg caagatgtac ttccagacca tgacagaagc cgagtaccag 
1621 acaagcatcc aggagaagtt gccactcatc atcggctcct cggccgctgg cctggtcttc 
1681 ctcattgctg tggttgtcat cgccatcgtg tgtaacagaa gacgggggtt tgagcgtgct
1741 gactcggagt acacggacaa gctgcaacac tacaccagtg gccacatgac cccaggcatg 
1801 aagatctaca tcgatccttt cacctacgag gaccccaacg aggcagtgcg ggagtttgcc 
1861 aaggaaattg acatctcctg tgtcaaaatt gagcaggtga tcggagcagg ggagtttggc 
1921 gaggtctgca gtggccacct gaagctgcca ggcaagagag agatctttgt ggccatcaag 
1981 acgctcaagt cgggctacac ggagaagcag cgccgggact tcctgagcga agcctccatc
2041 atgggccagt tcgaccatcc caacgtcatc cacctggagg gtgtcgtgac caagagcaca
2101 cctgtgatga tcatcaccga gttcatggag aatggctccc tggactcctt tctccggcaa 
2161 aacgatgggc agttcacagt catccagctg gtgggcatgc ttcggggcat cgcagctggc 
2221 atgaagtacc tggcagacat gaactatgtt caccgtgacc tggctgcccg caacatcctc 
2281 gtcaacagca acctggtctg caaggtgtcg gactttgggc tctcacgctt tctagaggac
2341 gatacctcag accccaccta caccagtgcc ctgggcggaa agatccccat ccgctggaca 
2401 gccccggaag ccatccagta ccggaagttc acctcggcca gtgatgtgtg gagctacggc
2461 attgtcatgt gggaggtgat gtcctatggg gagcggccct actgggacat gaccaaccag
2521 gatgtaatca atgccattga gcaggactat cggctgccac cgcccatgga ctgcccgagc 
2581 gccctgcacc aactcatgct ggactgttgg cagaaggacc gcaaccaccg gcccaagttc 
2641 ggccaaattg tcaacacgct agacaagatg atccgcaatc ccaacagcct caaagccatg 
2701 gcgcccctct cctctggcat caacctgccg ctgctggacc gcacgatccc cgactacacc 
2761 agctttaaca cggtggacga gtggctggag gccatcaaga tggggcagta caaggagagc
2821 ttcgccaatg ccggcttcac ctcctttgac gtcgtgtctc agatgatgat ggaggacatt 
2881 ctccgggttg gggtcacttt ggctggccac cagaaaaaaa tcctgaacag tatccaggtg
2941 atgcgggcgc agatgaacca gattcagtct gtggaggttt gacattcacc tgcctcggct
3001 cacctcttcc tccaagcccc gccccctctg ccccacgtgc cggccctcct ggtgctctat 
3061 ccactgcagg gccagccact cgccaggagg ccacgggcca cgggaagaac caagcggtgc
3121 cagccacgag acgtcaccaa gaaaacatgc aactcaaacg acggaaaaaa aaagggaatg 
3181 ggaaaaaaga aaacagatcc tgggaggggg cgggaaatac aaggaatatt ttttaaagag 
3241 gattctcata aggaaagcaa tgactgttct tgcgggggat aaaaaagggc ttgggagatt 
3301 catgcgatgt gtccaatcgg agacaaaagc agtttctctc caactccctc tgggaaggtg
3361 acctggccag agccaagaaa cactttcaga aaaacaaatg tgaaggggag agacaggggc
3421 cgcccttggc tcctgtccct gctgctcctc taggcctcac tcaacaacca agcgcctgga
3481 ggacgggaca gatggacaga cagccaccct gagaacccct ctgggaaaat ctattcctgc
3541 caccactggg caaacagaag aatttttctg tctttggaga gtattttaga aactccaatg 
3601 aaagacactg tttctcctgt tggctcacag ggctgaaagg ggcttttgtc ctcctgggtc 
3661 agggagaacg cggggacccc agaaaggtca gccttcctga ggatgggcaa cccccaggtc 
3721 tgcagctcca ggtacatatc acgcgcacag cctggcagcc tggccctcct ggtgcccact
3781 cccgccagcc cctgcctcga ggactgatac tgcagtgact gccgtcagct ccgactgccg 
3841 ctgagaaggg ttgatcctgc atctgggttt gtttacagca attcctggac tcgggggtat
3901 tttggtcaca gggtggtttt ggtttagggg gtttgtttgt tgggttgttt tttgtttttt
3961 ggtttttttt aatgacaatg aagtgacact ttgacatttc ctaccttttg aggacttgat
4021 ccttctccag gaagaaggtg ctttctgctt actgacttag gcaatacacc aagggcgaga 
4081 ttttatatgc acatttctgg atttttttat acggttttca ttgacactct tccctcctcc 
4141 cacctgccac caggcctcac caaagcccac tgccatgggg ccatctgggc cattcagaga 
4201 ctggagtgag atttgggtgt ggagggggag gcgccaaggt ggaggagctt cccactccag
4261 gactgttgat gaaagggaca gattgaggag gaagtgggct ctgaggctgc agggctggaa
4321 gtccttgccc acttcccact ctcctgcccc aatctatcta gtacttccca ggcaaatagg
4381 cccctttgag gctcctgagt gccctcagat ggtcaaaacc cagttttccc tctgggagcc
4441 taaaccaggc tgcatcggag gccaggaccc ggatcattca ctgtgatacc ctgccctcca 
4501 gagggtgcgc tcagagacac gggcaagcat gcctcttccc ttccctggag agaaagtgtg
4561 tgatttctct cccacctcct tccccccacc agacctttgc tgggcctaaa ggtcttggcc 
4621 atggggacgc cctcagtcta gggatctggc cacagactcc ctcctgtgaa ccaacacaga 
4681 cacccaagca gagcaatcag ttagtgaatt g (SEQ ID NO: 55) 



FIGURE 31B

EPHB2 (NM_004442) 
ALRRLGAALLLLPLLAAVEETLMDSTTATAELGWMVHPPSGWE 

VSGYDENMNTIRTYQVCNVFESSQNNWLRTKFIRRRGAHRIHVEMKFSVRDCSSIPS
PGSCKETFNLYYYEADFDSATKTFPNWMENPWKVDTIAADESFSQVDLGGRVMKIN 
EVRSFGPVSRSGFYLAFQDYGGCMSLIAVRVFYRKCPRIIQNGAIFQETLSGAESTS
VAARGSCIANAEEVDVPIKLYCNGDGEWLVPIGRCMCKAGFEAVENGTVCRGCPSGT 
KANQGDEACTHCPINSRTTSEGATNCVCRNGYYRADLDPLDMPCTTIPSAPQAVISS 
NETSLMLEWTPPRDSGGREDLVYNIICKSCGSGRGACTRCGDNVQYAPRQLGLTEPR 
YISDLLAHTQYTFEIQAVNGVTDQSPFSPQFASVNITTNQAAPSAVSIMHQVSRTVD 
ITLSWSQPDQPNGVILDYELQYYEKELSEYNATAIKSPTNTVTVQGLKAGAIYVFQV 
ARTVAGYGRYSGKMYFQTMTEAEYQTSIQEKLPLIIGSSAAGLVFLIAVVVIAIVCN
RRGFERADSEYTDKLQHYTSGHMTPGMKIYIDPFTYEDPNEAVREFAKEIDISCVKI 
QVIGAGEFGEVCSGHLKLPGKREIFVAIKTLKSGYTEKQRRDFLSEASIMGQFDHPN 
IHLEGVVTKSTPVMIITEFMENGSLDSFLRQNDGQFTVIQLVGMLRGIAAGMKYLAD
NYVHRDLAARNILVNSNLVCKVSDFGLSRFLEDDTSDPTYTSALGGKIPIRWTAPEA 
QYRKFTSASDVWSYGIVMWEVMSYGERPYWDMTNQDVINAIEQDYRLPPPMDCPSAL 
QLMLDCWQKDRNHRPKFGQIVNTLDKMIRNPNSLKAMAPLSSGINLPLLDRTIPDYT 
FNTVDEWLEAIKMGQYKESFANAGFTSFDVVSQMMMEDILRVGVTLAGHQKKILNSI
VMRAQMNQIQSVEV (SEQ ID NO: 56)
CRIPTO CR-1 (NM_003212)



1 ggagaatccc cggaaaggct gagtctccag ctcaaggtca aaacgtccaa ggccgaaagc 
61 cctccagttt cccctggacg ccttgctcct gcttctgcta cgaccttctg gggaaaacga 
121 atttctcatt ttcttcttaa attgccattt tcgctttagg agatgaatgt tttcctttgg 
181 ctgttttggc aatgactctg aattaaagcg atgctaacgc ctcttttccc cctaattgtt 
241 aaaagctatg gactgcagga agatggcccg cttctcttac agtgtgattt ggatcatggc 
301 catttctaaa gtctttgaac tgggattagt tgccgggctg ggccatcagg aatttgctcg
361 tccatctcgg ggatacctgg ccttcagaga tgacagcatt tggccccagg aggagcctgc 
421 aattcggcct cggtcttccc agcgtgtgcc gcccatgggg atacagcaca gtaaggagct 
481 aaacagaacc tgctgcctga atgggggaac ctgcatgctg gggtcctttt gtgcctgccc 
541 tccctccttc tacggacgga actgtgagca cgatgtgcgc aaagagaact gtgggtctgt 
601 gccccatgac acctggctgc ccaagaagtg ttccctgtgt aaatgctggc acggtcagct 
661 ccgctgcttt cctcaggcat ttctacccgg ctgtgatggc cttgtgatgg atgagcacct 
721 cgtggcttcc aggactccag aactaccacc gtctgcacgt actaccactt ttatgctagt 
781 tggcatctgc ctttctatac aaagctacta ttaatcgaca ttgacctatt tccagaaata 
841 caattttaga tatcatgcaa atttcatgac cagtaaaggc tgctgctaca atgtcctaac 
901 tgaaagatga tcatttgtag ttgccttaaa ataatgaata caatttccaa aatggtctct 
961 aacatttcct tacagaacta cttcttactt ctttgccctg ccctctccca aaaaactact 
1021 tcttttttca aaagaaagtc agccatatct ccattgtgcc taagtccagt gtttcttttt 
1081 tttttttttt ttgagacgga gtctcactct gtcacccagg ctggactgca atgacgcgat 
1141 cttggttcac tgcaacctcc gcatccgggg ttcaagccat tctcctgcct aagcctccca 
1201 agtaactggg attacaggca tgtgtcacca tgcccagcta atttttttgt attttagtag
1261 agatgggggt ttcaccatat tggccagtct ggtctcgaac tctgaccttg tgatccatcg
1321 atcagcctct cgagtgctga gattacacac gtgagcaact gtgcaaggcc tggtgtttct
1381 tgatacatgt aattctacca aggtcttctt aatatgttct tttaaatgat tgaattatat
1441 gttcagatta ttggagacta attctaatgt ggaccttaga atacagtttt gagtagagtt 

1501 gatcaaaatc aattaaaata gtctctttaa aaggaaagaa aacatcttta aggggaggaa
1561 ccagagtgct gaaggaatgg aagtccatct gcgtgtgtgc agggagactg ggtaggaaag 
1621 aggaagcaaa tagaagagag aggttgaaaa acaaaatggg ttacttgatt ggtgattagg 

1681 tggtggtaga gaagcaagta aaaaggctaa atggaagggc aagtttccat catctataga
1741 aagctatata agacaagaac tccccttttt ttcccaaagg cattataaaa agaatgaagc 
1801 ctccttagaa aaaaaattat acctcaatgt ccccaacaag attgcttaat aaattgtgtt
1861 tcctccaagc tattcaattc ttttaactgt tgtagaagac aaaatgttca caatatattt 
1921 agttgtaaac caagtgatca aactacatat tgtaaagccc atttttaaaa tacattgtat 
1981 atatgtgtat gcacagtaaa aatggaaact atattgacct aaaaaaaaaa aaa (SEQ ID NO: 57)



FIGURE 32A



CRIPTO CR-1 (NM_003212) 



DCRKMARFSYSVIWIMAISKVFELGLVAGLGHQEFARPSRGYL 

FRDDSIWPQEEPAIRPRSSQRVPPMGIQHSKELNRTCCLNGGTCMLGSFCACPPSFY 
RNCEHDVRKENCGSVPHDTWLPKKCSLCKCWHGQLRCFPQAFLPGCDGLVMDEHLVA
RTPELPPSARTTTFMLVGICLSIQSYY (SEQ ID NO: 58) 
Ephrin B1 (NM_004429)

1 gagtagacag cacagcggca gcggagggag tctatgcgag ctggacagca gtgggaggtt 
61 tgtgaggctc gcactggccg cagaccctcg ggctcgatcg cccgggagcc aggactcggc 
121 gacgcgaggc tgccgggcta cccggccgag gcttcggggg cgcaaactaa tgggactggc 
181 tcgctcggca gcatctcccc gctcttctaa gtacactgag cagggcccgc gctgaagtag 
241 aagctgtccg ggggcgcgta gcccggagtc ccagtgtggc ccggaggaac ggagcccgtg 
301 ccagggcggc ccagtcggga gcccggggac cgagcttgtg ctgtggggaa acccccactt
361 cttccaaggg acagcgatcc cgggacggtc gaggcgtcgg ggcggtcacc gagacctctg 
421 cgggaagacc ccgtcgggga gagggcgcgc agccccgaag cgtctcggga agtcgagcgg 
481 aatcgggcgg gatcacccgg gggcgcagag cccccgtcgc gcctcgtgcg gcagcggaga 
541 gcccaggaga acgagccctc gggggccgaa gcccatgccc gggttggggg cggctgccca 
601 gtgagtcctc ctggccggcc gggcggagaa gagcgacacc gaagccggcg ggaggggagc 
661 acttcaaggc cggcggctgc ggaggatggg cgcctgagcg gctccgagcg cagcgcggca 
721 gaggaaggcg aggcgagctt tggtgaggag gcgccaaggg atcccgaagt gcagtctgcc 
781 cccgggaaga tggctcggcc tgggcagcgt tggctcggca agtggcttgt ggcgatggtc 
841 gtgtgggcgc tgtgccggct cgccacaccg ctggccaaga acctggagcc cgtatcctgg 
901 agctccctca accccaagtt cctgagtggg aagggcttgg tgatctatcc gaaaattgga 
961 gacaagctgg acatcatctg cccccgagca gaagcagggc ggccctatga gtactacaag 
1021 ctgtacctgg tgcggcctga gcaggcagct gcctgtagca cagttctcga ccccaacgtg 
1081 ttggtcacct gcaataggcc agagcaggaa atacgcttta ccatcaagtt ccaggagttc 
1141 agccccaact acatgggcct ggagttcaag aagcaccatg attactacat tacctcaaca 
1201 tccaatggaa gcctggaggg gctggaaaac cgggagggcg gtgtgtgccg cacacgcacc
1261 atgaagatca tcatgaaggt tgggcaagat cccaatgctg tgacgcctga gcagctgact
1321 accagcaggc ccagcaagga ggcagacaac actgtcaaga tggccacaca ggcccctggt
1381 agtcggggct ccctgggtga ctctgatggc aagcatgaga ctgtgaacca ggaagagaag
1441 agtggcccag gtgcaagtgg gggcagcagc ggggaccctg atggcttctt caactccaag 
1501 gtggcattgt tcgcggctgt cggtgccggt tgcgtcatct tcctgctcat catcatcttc 
1561 ctgacggtcc tactactgaa gctacgcaag cggcaccgca agcacacaca gcagcgggcg 
1621 gctgccctct cgctcagtac cctggccagt cccaaggggg gcagtggcac agcgggcacc 
1681 gagcccagcg acatcatcat tcccttacgg actacagaga acaactactg cccccactat 
1741 gagaaggtga gtggggacta cgggcaccct gtctacatcg tccaagagat gccgccccag 
1801 agcccggcga acatctacta caaggtctga gtgcccggca cggcctcagg cccccgaggg
1861 acagtcggcc tggaccggac ctctcctttc gcccccacac cccctcccct tgccagctgt 
1921 gcccaccttt gtatttagtt ttgtagtttc ttggctttta taatccccct ttttccctgc 
1981 cccctgggct tcggaggggg gtgcttgtgc ccctaacccc catgctcttg tgccttcccc 
2041 ctctggccag gcctctgggc tccgtggggg cgccccttct tggaaggcag ggctggacac
2101 tgatggacag caggcaggga gacagtcccc tggccctgcc cctccctcgc cccccttgcc 
2161 accttcccag gactgcttgt ccgctatcat cactgttttt aatgcttttg tgttcatttt 
2221 ttagctgtca actcattttc atctgttttt tgaagaaaaa tggaaaaatg taaaaggcag 
2281 cccctcccca ggctttgtga gcctggccca agccagtaca agagggcctg gggcacgatg
2341 tggtcagcca ggaagcatag gatgccattt cttttataga ttccttggta tttctggtgg 
2401 ggtaaggggc aggccagggc tgttcacgcc catgagggaa gaggaaagtg ccactgggca
2461 aggtgtccca ccctcccctc ctgaccctcc tacgaggctt atcctggcaa tggggtagtc 
2521 actgccaccc ttccacacac acacacacac acacacacac aaaaaaaaat cccttccttg 
2581 tgggattctt gggcatctcc tgcctccctc actctcacgg taattaatgt cttaattggc 
2641 tgttgcctgg ggaacaggag agctgctgca ggcagatgac ctcatggggg gtggagggag
2701 gtgaggtgcc caggtggcta tttgccctgc agagctggga gtttcacccc caccccccac 
2761 cctgttctct ccttaccttt ggcatccttt ggcctggtgg ggaaacagag gcccagggtg 
2821 gagacctaag cgggtataag accaggtggc ctgctccttt tctgggccct agcacaggtg
2881 ggtaaccccc acccaaccca gctcctgctg ctgtcccagt cttgggctgg ggcctggaaa
2941 gaggaagagg ctgcctgggg ctgggccagc ccgctgtgca ctttgacccc agttccttgc
3001 cagcacggct gctaacagac tgccacttga gtgcgccttg caggcactcc cagagcagcc
3061 atggaaggag ctggccctca caccatccac ctccacactg cctcctggcc agctgcccac
3121 cccagtgcca ggtgggagag ggagcagaac agccagcccc ttccaggtgg cagtcggaag 
3181 ggtttttgtt tttgtttctg ttgccatttg tgtaaatact agtctttttg gaaaaaaaat 
3241 aatgtaaaga tgttttgtat aaactctgaa ttattttctt gttgcttttt tcttagaaaa 
3301 aaatgagaac taaaaaaaaa aaattaacca catggaaaaa aaaaaa (SEQ ID NO: 59)
Ephrin B1 (NM_004429)



MARPGQRWLGKWLVAMVVWALCRLATPLAKNLEPVSWSSLNPKF 

LSGKGLVIYPKIGDKLDIICPRAEAGRPYEYYKLYLVRPEQAAACSTVLDPNVLVTCN
RPEQEIRFTIKFQEFSPNYMGLEFKKHHDYYITSTSNGSLEGLENREGGVCRTRTMKI
IMKVGQDPNAVTPEQLTTSRPSKEADNTVKMATQAPGSRGSLGDSDGKHETVNQEEKS
GPGASGGSSGDPDGFFNSKVALFAAVGAGCVIFLLIIIFLTVLLLKLRKRHRKHTQQR
AAALSLSTLASPKGGSGTAGTEPSDIIIPLRTTENNYCPHYEKVSGDYGHPVYIVQEM
PPQSPANIYYKV (SEQ ID NO: 60)
MMP-17/MT4-MMP (NM_016155)

1 ccggcggggg cgccgcggag agcggagggc gccgggctgc ggaacgcgaa gcggagggcg 
61 cgggaccctg cacgccgccc gcgggcccat gtgagcgcca tgcggcgccg cgcagcccgg 
121 ggacccggcc cgccgccccc agggcccgga ctctcgcggt tgccgctgct gccgctgccg 
181 ctgctgctgc tgctggcgct ggggacccgc gggggctgcg ccgcgcccgc acccgcgccg 
241 cgcgccgagg acctcagcct gggagtggag tggctaagca ggttcggtta cctgcccccg 
301 gctgacccca caacagggca gctgcagacg caagaggagc tgtctaaggc catcacagcc
361 atgcagcagt ttggtggcct ggaggccacc ggcatcctgg acgaggccac cctggccctg
421 atgaaaaccc cacgctgctc cctgccagac ctccctgtcc tgacccaggc tcgcaggaga 
481 cgccaggctc cagcccccac caagtggaac aagaggaacc tgtcgtggag ggtccggacg 
541 ttcccacggg actcaccact ggggcacgac acggtgcgtg cactcatgta ctacgccctc 
601 aaggtctgga gcgacattgc gcccctgaac ttccacgagg tggcgggcag caccgccgac 
661 atccagatcg acttctccaa ggccgaccat aacgacggct accccttcga cggccccggc 
721 ggcaccgtgg cccacgcctt cttccccggc caccaccaca ccgccgggga cacccacttt 
781 gacgatgacg aggcctggac cttccgctcc tcggatgccc acgggatgga cctgtttgca 
841 gtggctgtcc acgagtttgg ccacgccatt gggttaagcc atgtggccgc tgcacactcc 
901 atcatgcggc cgtactacca gggcccggtg ggtgacccgc tgcgctacgg gctcccctac 
961 gaggacaagg tgcgcgtctg gcagctgtac ggtgtgcggg agtctgtgtc tcccacggcg 
1021 cagcccgagg agcctcccct gctgccggag cccccagaca accggtccag cgccccgccc 
1081 aggaaggacg tgccccacag atgcagcact cactttgacg cggtggccca gatccgcggt 
1141 gaagctttct tcttcaaagg caagtacttc tggcggctga cgcgggaccg gcacctggtg 

1201 tccctgcagc cggcacagat gcaccgcttc tggcggggcc tgccgctgca cctggacagc
1261 gtggacgccg tgtacgagcg caccagcgac cacaagatcg tcttctttaa aggagacagg 
1321 tactgggtgt tcaaggacaa taacgtagag gaaggatacc cgcgccccgt ctccgacttc 

1381 agcctcccgc ctggcggcat cgacgctgcc ttctcctggg cccacaatga caggacttat
1441 ttctttaagg accagctgta ctggcgctac gatgaccaca cgaggcacat ggaccccggc 
1501 taccccgccc agagccccct gtggaggggt gtccccagca cgctggacga cgccatgcgc
1561 tggtccgacg gtgcctccta cttcttccgt ggccaggagt actggaaagt gctggatggc
1621 gagctggagg tggcacccgg gtacccacag tccacggccc gggactggct ggtgtgtgga 
1681 gactcacagg ccgatggatc tgtggctgcg ggcgtggacg cggcagaggg gccccgcgcc 
1741 cctccaggac aacatgacca gagccgctcg gaggacggtt acgaggtctg ctcatgcacc 

1801 tctggggcat cctctccccc gggggcccca ggcccactgg tggctgccac catgctgctg
1861 ctgctgccgc cactgtcacc aggcgccctg tggacagcgg cccaggccct gacgctatga 
1921 cacacagcgc gagcccatga gaggacagag gcggtgggac agcctggcca cagagggcaa 

1981 ggactgtgcc ggagtccctg ggggaggtgc tggcgcggga tgaggacggg ccaccctggc
2041 accggaaggc cagcagaggg cacggcccgc cagggctggg caggctcagg tggcaaggac
2101 ggagctgtcc cctagtgagg gactgtgttg actgacgagc cgaggggtgg ccgctccaga 
2161 agggtgccca gtcaggccgc accgccgcca gcctcctccg gccctggagg gagcatctcg 
2221 ggctgggggc ccacccctct ctgtgccggc gccaccaacc ccacccacac tgctgcctgg 
2281 tgctcccgcc ggcccacagg gcctccgtcc ccaggtcccc agtggggcag ccctccccac 
2341 agacgagccc cccacatggt gccgcggcac gtcccccctg tgacgcgttc cagaccaaca 
2401 tgacctctcc ctgctttgta aaaaaaaaaa aaaaaaaa (SEQ ID NO: 61) 
MMP-17/MT4-MMP (NM_016155)



MRRRAARGPGPPPPGPGLSRLPLLPLPLLLLLALGTRGGCAAPA

PAPRAEDLSLGVEWLSRFGYLPPADPTTGQLQTQEELSKAITAMQQFGGLEATGILDE
ATLALMKTPRCSLPDLPVLTQARRRRQAPAPTKWNKRNLSWRVRTFPRDSPLGHDTVR
ALMYYALKVWSDIAPLNFHEVAGSTADIQIDFSKADHNDGYPFDGPGGTVAHAFFPGH
HHTAGDTHFDDDEAWTFRSSDAHGMDLFAVAVHEFGHAIGLSHVAAAHSIMRPYYQGP
VGDPLRYGLPYEDKVRVWQLYGVRESVSPTAQPEEPPLLPEPPDNRSSAPPRKDVPHR
CSTHFDAVAQIRGEAFFFKGKYFWRLTRDRHLVSLQPAQMHRFWRGLPLHLDSVDAVY
ERTSDHKIVFFKGDRYWVFKDNNVEEGYPRPVSDFSLPPGGIDAAFSWAHNDRTYFFK
DQLYWRYDDHTRHMDPGYPAQSPLWRGVPSTLDDAMRWSDGASYFFRGQEYWKVLDGE
LEVAPGYPQSTARDWLVCGDSQADGSVAAGVDAAEGPRAPPGQHDQSRSEDGYEVCSC
TSGASSPPGAPGPLVAATMLLLLPPLSPGALWTAAQALTL (SEQ ID NO: 62)



FIGURE 34B 






MMP26 (NM_021801)

1 gacaaatgag ggtttggcat gcagctcgtc atcttaagag ttactatctt cttgccctgg 
61 tgtttcgccg ttccagtgcc ccctgctgca gaccataaag gatgggactt tgttgagggc 
121 tatttccatc aatttttcct gaccgagaag gagtcgccac tccttaccca ggagacacaa 
181 acacagctcc tgcaacaatt ccatcggaat gggacagacc tacttgacat gcagatgcat 
241 gctctgctac accagcccca ctgtggggtg cctgatgggt ccgacacctc catctcgcca 
301 ggaagatgca agtggaataa gcacactcta acttacagga ttatcaatta cccacatgat
361 atgaagccat ccgcagtgaa agacagtata tataatgcag tttccatctg gagcaatgtg
421 acccctttga tattccagca agtgcagaat ggagatgcag acatcaaggt ttctttctgg
481 cagtgggccc atgaagatgg ttggcccttt gatgggccag gtggtatctt aggccatgcc
541 tttttaccaa attctggaaa tcctggagtt gtccattttg acaagaatga acactggtca 
601 gcttcagaca ctggatataa tctgttcctg gttgcaactc atgagattgg gcattctttg 
661 ggcctgcagc actctgggaa tcagagctcc ataatgtacc ccacttactg gtatcacgac 
721 cctagaacct tccagctcag tgccgatgat atccaaagga tccagcattt gtatggagaa 
781 aaatgttcat ctgacatacc ttaatgttag cacagaggac ttattcaacc tgtcctttca 
841 gggagtttat tggaggatca aagaactgaa agcactagag cagccttggg gactgctagg 
901 atgaagccct aaagaatgca acctagtcag gttagctgaa ccgacactca aaacgctact
961 gagtcacaat aaagattgtt ttaaagagta aaaaaaaaaa aaaaaaaaa (SEQ ID NO: 63)



FIGURE 35A 



MMP26 (NM_021801)



MQLVILRVTIFLPWCFAVPVPPAADHKGWDFVEGYFHQFFLTEK 

SPLLTQETQTQLLQQFHRNGTDLLDMQMHALLHQPHCGVPDGSDTSISPGRCKWNKH 
LTYRIINYPHDMKPSAVKDSIYNAVSIWSNVTPLIFQQVQNGDADIKVSFWQWAHED
WPFDGPGGILGHAFLPNSGNPGVVHFDKNEHWSASDTGYNLFLVATHEIGHSLGLQH
GNQSSIMYPTYWYHDPRTFQLSADDIQRIQHLYGEKCSSDIP (SEQ ID NO: 64)
ADAM10 (NM_001110)
1 gaattcgagg atccgggtac catgggcggc ggcaggccta gcagcacggg aaccgtcccc 
61 cgcgcgcatg cgcgcgcccc tgaagcgcct gggggacggg tatgggcggg aggtaggggc 
121 gcggctccgc gtgccagttg ggtgcccgcg cgtcacgtgg tgaggaagga ggcggaggtc 
181 tgagtttcga gggagggggg gagagaagag ggaacgagca agggaaggaa agcggggaaa 
241 ggaggaagga aacgaacgag ggggagggag gtccctgttt tggaggagct aggagcgttg 
301 ccggcccctg aagtggagcg agagggaggt gcttcgccgt ttctcctgcc aggggaggtc
361 ccggcttccc gtggaggctc cggaccaagc cccttcagct tctccctccg gatcgatgtg
421 ctgctgttaa cccgtgagga ggcggcggcg gcggcagcgg cagcggaaga tggtgttgct 
481 gagagtgtta attctgctcc tctcctgggc ggcggggatg ggaggtcagt atgggaatcc 
541 tttaaataaa tatatcagac attatgaagg attatcttac aatgtggatt cattacacca 
601 aaaacaccag cgtgccaaaa gagcagtctc acatgaagac caatttttac gtctagattt 
661 ccatgcccat ggaagacatt tcaacctacg aatgaagagg gacacttccc ttttcagtga 
721 tgaatttaaa gtagaaacat caaataaagt acttgattat gatacctctc atatttacac 
781 tggacatatt tatggtgaag aaggaagttt tagccatggg tctgttattg atggaagatt 
841 tgaaggattc atccagactc gtggtggcac attttatgtt gagccagcag agagatatat 
901 taaagaccga actctgccat ttcactctgt catttatcat gaagatgata ttaactatcc 
961 ccataaatac ggtcctcagg ggggctgtgc agatcattca gtatttgaaa gaatgaggaa 
1021 ataccagatg actggtgtag aggaagtaac acagatacct caagaagaac atgctgctaa 
1081 tggtccagaa cttctgagga aaaaacgtac aacttcagct gaaaaaaata cttgtcagct 
1141 ttatattcag actgatcatt tgttctttaa atattacgga acacgagaag ctgtgattgc 

1201 ccagatatcc agtcatgtta aagcgattga tacaatttac cagaccacag acttctccgg
1261 aatccgtaac atcagtttca tggtgaaacg cataagaatc aatacaactg ctgatgagaa 
1321 ggaccctaca aatcctttcc gtttcccaaa tattggtgtg gagaagtttc tggaattgaa 

1381 ttctgagcag aatcatgatg actactgttt ggcctatgtc ttcacagacc gagattttga
1441 tgatggcgta cttggtctgg cttgggttgg agcaccttca ggaagctctg gaggaatatg 
1501 tgaaaaaagt aaactctatt cagatggtaa gaagaagtcc ttaaacactg gaattattac
1561 tgttcagaac tatgggtctc atgtacctcc caaagtctct cacattactt ttgctcacga 
1621 agttggacat aactttggat ccccacatga ttctggaaca gagtgcacac caggagaatc 
1681 taagaatttg ggtcaaaaag aaaatggcaa ttacatcatg tatgcaagag caacatctgg 
1741 ggacaaactt aacaacaata aattctcact ctgtagtatt agaaatataa gccaagttct 
1801 tgagaagaag agaaacaact gttttgttga atctggccaa cctatttgtg gaaatggaat
1861 ggtagaacaa ggtgaagaat gtgattgtgg ctatagtgac cagtgtaaag atgaatgctg
1921 cttcgatgca aatcaaccag agggaagaaa atgcaaactg aaacctggga aacagtgcag 
1981 tccaagtcaa ggtccttgtt gtacagcaca gtgtgcattc aagtcaaagt ctgagaagtg 
2041 tcgggatgat tcagactgtg caagggaagg aatatgtaat ggcttcacag ctctctgccc
2101 agcatctgac cctaaaccaa acttcacaga ctgtaatagg catacacaag tgtgcattaa 
2161 tgggcaatgt gcaggttcta tctgtgagaa atatggctta gaggagtgta cgtgtgccag 
2221 ttctgatggc aaagatgata aagaattatg ccatgtatgc tgtatgaaga aaatggaccc 
2281 atcaacttgt gccagtacag ggtctgtgca gtggagtagg cacttcagtg gtcgaaccat 
2341 caccctgcaa cctggatccc cttgcaacga ttttagaggt tactgtgatg ttttcatgcg 
2401 gtgcagatta gtagatgctg atggtcctct agctaggctt aaaaaagcaa tttttagtcc 
2461 agagctctat gaaaacattg ctgaatggat tgtggctcat tggtgggcag tattacttat 
2521 gggaattgct ctgatcatgc taatggctgg atttattaag atatgcagtg ttcatactcc 
2581 aagtagtaat ccaaagttgc ctcctcctaa accacttcca ggcactttaa agaggaggag 
2641 acctccacag cccattcagc aaccccagcg tcagcggccc cgagagagtt atcaaatggg
2701 acacatgaga cgctaactgc agcttttgcc ttggttcttc ctagtgccta caatgggaaa
2761 acttcactcc aaagagaaac ctattaagtc atcatctcca aactaaaccc tcacaagtaa 
2821 cagttgaaga aaaaatggca agagatcata tcctcagacc aggtggaatt acttaaattt
2881 taaagcctga aaattccaat ttgggggtgg gaggtggaaa aggaacccaa ttttcttatg
2941 aacagatatt tttaacttaa tggcacaaag tcttagaata ttattatgtg ccccgtgttc
3001 cctgttcttc gttgctgcat tttcttcact tgcaggcaaa cttggctctc aataaacttt
3061 taccacaaat tgaaataaat atattttttt caactgccaa tcaaggctag gaggctcgac
3121 cacctcaaca ttggagacat cacttgccaa tgtacatacc ttgttatatg cagacatgta 
3181 tttcttacgt acactgtact tctgtgtgca attgtaaaca gaaattgcaa tatggatgtt 
3241 tctttgtatt ataaaatttt tccgctctta attaaaaatt actgtttaat tgacatactc 
3301 aggataacag agaatggtgg tattcagtgg tccaggattc tgtaatgctt tacacaggca 
3361 gttttgaaat gaaaatcaat ttaccccatg gtacccggat cctcgaattc (SEQ ID NO: 65)
ADAM10 (NM_001110)



VLLRVLILLLSWAAGMGGQYGNPLNKYIRHYEGLSYNVDSLHQ 

HQRAKRAVSHEDQFLRLDFHAHGRHFNLRMKRDTSLFSDEFKVETSNKVLDYDTSHI 

TGHIYGEEGSFSHGSVIDGRFEGFIQTRGGTFYVEPAERYIKDRTLPFHSVIYHEDD

NYPHKYGPQGGCADHSVFERMRKYQMTGVEEVTQIPQEEHAANGPELLRKKRTTSAE 

NTCQLYIQTDHLFFKYYGTREAVIAQISSHVKAIDTIYQTTDFSGIRNISFMVKRIR 

NTTADEKDPTNPFRFPNIGVEKFLELNSEQNHDDYCLAYVFTDRDFDDGVLGLAWVG 

PSGSSGGICEKSKLYSDGKKKSLNTGIITVQNYGSHVPPKVSHITFAHEVGHNFGSP 

DSGTECTPGESKNLGQKENGNYIMYARATSGDKLNNNKFSLCSIRNISQVL

FVESGQPICGNGMVEQGEECDCGYSDQCKDECCFDANQPEGRKCKLKPGKQCSPSQG 

CCTAQCAFKSKSEKCRDDSDCAREGICNGFTALCPASDPKPNFTDCNRHTQVCINGQ 

AGSICEKYGLEECTCASSDGKDDKELCHVCCMKKMDPSTCASTGSVQWSRHFSGRTI 

LQPGSPCNDFRGYCDVFMRCRLVDADGPLARLKKAIFSPELYENIAEWIVAHWWAVL

MGIALIMLMAGFIKICSVHTPSSNPKLPPPKPLPGTLKRRRPPQPIQQPQRQRPRES 

QMGHMRR (SEQ ID NO: 66) 



FIGURE 36B




ADAM1 (XM_132370)

1 cttgggtggg cagtgcaagc caactgcagt cagcaagtgt gcgggcttaa gagttcttcc 
61 agagcccact tccattttct ttgttgcttt aactagagtc accagtctgt cttcattttt 
121 atggtgagac cattgggaga actaacttag attttaggct ctaatatagt tctgtggtaa 
181 aaataagatc atgtaacact tatgctttag aaatttccat agagaaggat catgtcttaa 
241 agccaaaatt tatttggtag acacaaggat acgggaaagt agaacatcta aatactgtgt 
301 gtgtgtgcgt gtgcgtgtgc gtgtgtgtgt acaccagtga aaggaatcag gcagtctaag
361 agaactagct atccatccag catgaccact gtaagaatga ggaatgaggc aggacaacag 
421 agaactctta attgttcaga gaacccagag aactttgtcc cctcccccga aaccctgcag 
481 aatgttgagt ctgaaagtat gagctggtta acatgtcagg ggcccatgac ctgtggagga 
541 ggaaagatga tgtgacaagc acagaaccgg ctgagccact gtagatgcag ggctcatctc 
601 catgaatgtc aaaggaactt aagcaacact gaagctcctc cacttgaaag aagcccctgt 
661 gctgcacata tccaccaagg ccaggagaaa gaaaggagag agacacagcc tgagaccgca 
721 cagtttcttg ggaagctccc cagtaaggca cgggcacagg tctgggtgcc tgggtctggg 
781 aaaagcagag agcactgccg ctgatggaca gagatcctcc atcatcagca gtttgttgga 
841 gccatgtcag tggcagcagc ggggagaggg tttgcctcca gtctgtcttc cccacagatc 
901 aggcgaatag ccttaaaaga agctaagcta acacctcaca tctgggcggc actgcactgg 
961 aacttgggac tgagactagt gccatctgtc agagtaggga ttttggtgct actgattttt 
1021 ctcccgagca cgttctgtga cattggatct gtatataatt cttcctatga aactgtcatc 
1081 cctgagagac tgccaggcaa gggggggaaa gaccctggag ggaaggtgtc ctacatgcta
1141 ttgatgcaag gccaaaagca gctgcttcac ctcgaggtaa agggacacta ccctgagaat 
1201 aacttcccag tctacagtta ccacaatggc atcctgaggc aagaaatgcc tctcctctcc
1261 caggactgcc actatgaagg ctacatggaa ggggtgccag gctcctttgt ttctgtcaac
1321 atctgttcag gcctcagggg ggtcttgatt aaagaggaaa catcctatgg cattgagccc 

1381 atgctctctt ccaaaaactt tgaacatgtc ctctacacca tggagcatca gcctgtggtc
1441 tcctgcagtg tcactcccaa agacagccct ggggacacca gccatccacc aaggagcagg 
1501 aagcccgatg acctactggt tctgactgac tggtggtcac acaccaagta tgtggagatg
1561 tttgtggtgg tcaaccacca gcggttccag atgtggggca gtaacatcaa cgagacggtc
1621 caggcagtaa tggacatcat tgctctggcc aacagcttca ctagggggat aaacacagag 

1681 gtggtgctgg tgggcctgga aatctggaca gagggggacc cgatagaggt cccagtggac
1741 ctgcagacca cactcaggaa tttcaacttc tggagacagg agaaactcgt gggccgggtc 
1801 aggcacgatg tggcacactt gatcgtcggg catcgcccag gagagaacga gggccaggcg
1861 tttctccgtg gtgcctgttc gggtgagttt gcggcggccg tggaggcctt ccatcatgaa
1921 gatgtcctcc tgttcgcggc tctcatggcc cacgagctcg ggcacaacct gggtatccag 

1981 cacgaccacc cgacctgcac ctgtggtccc aagcacttct gcctcatggg tgagaagatc
2041 ggtaaggaca gtggcttcag caactgcagc tctgaccact tcctccgttt cctccatgac 
2101 cacagagggg cgtgcctgct tgatgagcct gggcgccaga gccgcatgcg cagagctgcc 
2161 aattgtggga atggtgtggt ggaggacttg gaggagtgtg actgcggcag tgactgtgac 
2221 agtcacccgt gctgttcgcc aacatgtacg cttaaggagg gtgcgcagtg cagtgaggga 
2281 ctctgctgct acaactgtac attcaagaag aaagggagct tatgccgtcc tgctgaggat
2341 gtgtgtgacc ttcccgagta ttgtgacggc agtactcagg aatgccctgc aaacagctac 
2401 atgcaggatg gcacacagtg tgataggatt tattactgct tggggggttg gtgtaagaac 
2461 cctgataaac aatgttcaag gatctatggg tatcctgcaa gatctgcccc tgaggaatgt 
2521 tacatttcag ttaatactaa ggcgaaccgg tttggaaact gtggccatcc cacctccgct 
2581 aacttcagat atgaaacatg ttccgatgag gatgtatttt gtgggaaact ggtgtgtaca
2641 gatgttagat acctgcccaa agtcaaaccc ctacactcac tcctccaggt tccttatgga
2701 gaggactggt gttggagtat ggatgcctat aacatcacag atgtcccgga tgacggagat 
2761 gtacagagcg gcaccttctg tgccccaaac aaagtctgca tggagtatat ctgcactggt
2821 cgtggggtgc tccagtacaa ctgtgagcca caggaaatgt gtcacgggaa tggagtgtgc 
2881 aacaatttca agcactgtca ctgcgatgct ggcttcgccc ctcctgactg tagcagtcca 

2941 ggaaatgggg ggagtgtgga cagtggtcct gttggtaagc ccgctgatcg acacttgagt
3001 ctctcttttc tggctgaaga gagtccagat gataaaatgg aggatgaaga ggtaaacctg
3061 aaagtgatgg tgcttgtggt ccctatattt cttgtcgttt tactgtgctg tctaatgctg
3121 atcgcctacc tctggtctga agtacaagaa gtagtatctc caccgagttc atcagagtct 
3181 tcgtcttcat catcctggtc agactctgac tctcagtgaa gttttattta agatcctctc 
3241 atggatcatt gctatcgatg tcttgtattt gcagggcaat tttgcctaag tggattttag 
3301 ggcatgctgt tcagtgtaat gtgtggtcta tatacttgtg ttgctcatct cagaaacaac
3361 tggaattata tcctgaatga tgttaaggga tctaaatgtt ctaacttgcc ctgtcagctc 
3421 ctgttcataa aatagaaggc attttaaata aatataaa (SEQ ID NO: 67)
ADAM1 (XM_132370)



MSVAAAGRGFASSLSSPQIRRIALKEAKLTPHIWAALHWNLGLR

LVPSVRVGILVLLIFLPSTFCDIGSVYNSSYETVIPERLPGKGGKDPGGKVSYMLLMQ 
GQKQLLHLEVKGHYPENNFPVYSYHNGILRQEMPLLSQDCHYEGYMEGVPGSFVSVNI 
CSGLRGVLIKEETSYGIEPMLSSKNFEHVLYTMEHQPVVSCSVTPKDSPGDTSHPPRS
RKPDDLLVLTDWWSHTKYVEMFVVVNHQ

NTEVVLVGLEIWTEGDPIEVPVDLQTTLRNFNFWRQEKLVGRVRHDVAHLIVGHRPGE
NEGQAFLRGACSGEFAAAVEAFHHEDVLLFAALMAHELGHNLGIQHDHPTCTCGPKHF
CLMGEKIGKDSGFSNCSSDHFLRFLHDHRGACLLDEPGRQSRMRRAANCGNGVVEDLE
ECDCGSDCDSHPCCSPTCTLKEGAQCSEGLCCYNCTFKKKGSLCRPAEDVCDLPEYCD
GSTQECPANSYMQDGTQCDRIYYCLGGWCKNPDKQCSRIYGYPARSAPEECYISVNTK
ANRFGNCGHPTSANFRYETCSDEDVFCGKLVCTDVRYLPKVKPLHSLLQVPYGEDWCW
SMDAYNITDVPDDGDVQSGTFCAPNKVCMEYICTGRGVLQYNCEPQEMCHGNGVCNNF
KHCHCDAGFAPPDCSSPGNGGSVDSGPVGKPADRHLSLSFLAEESPDDKMEDEEVNLK
VMVLVVPIFLVVLLCCLMLIAYLWSEVQEVVSPPSSSESSSSSSWSDSDSQ (SEQ ID NO: 68)
TIMP1 (NM_003254)



1 aggggcctta gcgtgccgca tcgccgagat ccagcgccca gagagacacc agagaaccca 

61 ccatggcccc ctttgagccc ctggcttctg gcatcctgtt gttgctgtgg ctgatagccc 

121 ccagcagggc ctgcacctgt gtcccacccc acccacagac ggccttctgc aattccgacc 

181 tcgtcatcag ggccaagttc gtggggacac cagaagtcaa ccagaccacc ttataccagc 

241 gttatgagat caagatgacc aagatgtata aagggttcca agccttaggg gatgccgctg 

301 acatccggtt cgtctacacc cccgccatgg agagtgtctg cggatacttc cacaggtccc

361 acaaccgcag cgaggagttt ctcattgctg gaaaactgca ggatggactc ttgcacatca 

421 ctacctgcag tttcgtggct ccctggaaca gcctgagctt agctcagcgc cggggcttca 

481 ccaagaccta cactgttggc tgtgaggaat gcacagtgtt tccctgttta tccatcccct 

541 gcaaactgca gagtggcact cattgcttgt ggacggacca gctcctccaa ggctctgaaa 

601 agggcttcca gtcccgtcac cttgcctgcc tgcctcggga gccagggctg tgcacctggc 

661 agtccctgcg gtcccagata gcctgaatcc tgcccggagt ggaactgaag cctgcacagt 

721 gtccaccctg ttcccactcc catctttctt ccggacaatg aaataaagag ttaccaccca 
781 gc (SEQ ID NO: 69) 



FIGURE 38A



TIMP1 (NM_003254)



APFEPLASGILLLLWLIAPSRACTCVPPHPQTAFCNSDLVIRA 

FVGTPEVNQTTLYQRYEIKMTKMYKGFQALGDAADIRFVYTPAMESVCGYFHRSHNR
EEFLIAGKLQDGLLHITTCSFVAPWNSLSLAQRRGFTKTYTVGCEECTVFPCLSIPC
LQSGTHCLWTDQLLQGSEKGFQSRHLACLPREPGLCTWQSLRSQIA (SEQ ID NO: 70) 
MUC1 (XM_053256)

1 cgctccacct ctcaagcagc cagcgcctgc ctgaatctgt tctgccccct ccccacccat 
61 ttcaccacca ccatgacacc gggcacccag tctcctttct tcctgctgct gctcctcaca 
121 gtgcttacag ttgttacagg ttctggtcat gcaagctcta ccccaggtgg agaaaaggag 
181 acttcggcta cccagagaag ttcagtgccc agctctactg agaagaatgc tgtgagtatg 
241 accagcagcg tactctccag ccacagcccc ggttcaggct cctccaccac tcagggacag 
301 gatgtcactc tggccccggc cacggaacca gcttcaggtt cagctgccac ctggggacag
361 gatgtcacct cggtcccagt caccaggcca gccctgggct ccaccacccc gccagcccac
421 gatgtcacct cagccccgga caacaagcgg gcccggggct ccaccgcccc cccagcccac 
481 ggtgtcacct cggccccgga caccaggccg gccccgggct ccaccgcccc cccagcccat 
541 ggtgtcacct cggccccgga caacaggccc gccttgggct ccaccgcccc tccagtccac 
601 aatgtcacct cggcctcagg ctctgcatca ggctcagctt ctactctggt gcacaacggc 
661 acctctgcca gggctaccac aaccccagcc agcaagagca ctccattctc aattcccagc 
721 caccactctg atactcctac cacccttgcc agccatagca ccaagactga tgccagtagc 
781 actcaccata gcacggtacc tcctctcacc tcctccaatc acagcacttc tccccagttg 
841 tctactgggg tctctttctt tttcctgtct tttcacattt caaacctcca gtttaattcc 
901 tctctggaag atcccagcac cgactactac caagagctgc agagagacat ttctgaaatg 
961 tttttgcaga tttataaaca agggggtttt ctgggcctct ccaatattaa gttcaggcca 
1021 ggatctgtgg tggtacaatt gactctggcc ttccgagaag gtaccatcaa tgtccacgac 
1081 gtggagacac agttcaatca gtataaaacg gaagcagcct ctcgatataa cctgacgatc
1141 tcagacgtca gcgtgagtga tgtgccattt cctttctctg cccagtctgg ggctggggtg 

1201 ccaggctggg gcatcgcgct gctggtgctg gtctgtgttc tggttgcgct ggccattgtc
1261 tatctcattg ccttggctgt ctgtcagtgc cgccgaaaga actacgggca gctggacatc 
1321 tttccagccc gggataccta ccatcctatg agcgagtacc ccacctacca cacccatggg 

1381 cgctatgtgc cccctagcag taccgatcgt agcccctatg agaaggtttc tgcaggtaat
1441 ggtggcagca gcctctctta cacaaaccca gcagtggcag ccacttctgc caacttgtag 
1501 gggcacgtcg cccgctgagc tgagtggcca gccagtgcca ttccactcca ctcaggttct
1561 tcagggccag agcccctgca ccctgtttgg gctggtgagc tgggagttca ggtgggctgc 
1621 tcacagcctc cttcagaggc cccaccaatt tctcggacac ttctcagtgt gtggaagctc 
1681 atgtgggccc ctgagggctc atgcctggga agtgttgtgg tgggggctcc caggaggact 
1741 ggcccagaga gccctgagat agcggggatc ctgaactgga ctgaataaaa cgtggtctcc 
1801 cactg (SEQ ID NO: 71) 



FIGURE 39A



MUC1 (XM_053256)



MTPGTQSPFFLLLLLTVLTVVTGSGHASSTPGGEKETSATQRSS

VPSSTEKNAVSMTSSVLSSHSPGSGSSTTQGQDVTLAPATEPASGSAATWGQDVTSVP 
VTRPALGSTTPPAHDVTSAPDNKRARGSTAPPAHGVTSAPDTRPAPGSTAPPAHGVTS 
APDNRPALGSTAPPVHNVTSASGSASGSASTLVHNGTSARATTTPASKSTPFSIPSHH 
SDTPTTLASHSTKTDASSTHHSTVPPLTSSNHSTSPQLSTGVSFFFLSFHISNLQFNS
SLEDPSTDYYQELQRDISEMFLQIYKQGGFLGLSNIKFRPGSVVVQLTLAFREGTINV
HDVETQFNQYKTEAASRYNLTISDVSVSDVPFPFSAQSGAGVPGWGIALLVLVCVLVA
LAIVYLIALAVCQCRRKNYGQLDIFPARDTYHPMSEYPTYHTHGRYVPPSSTDRSPYE
KVSAGNGGSSLSYTNPAVAATSANL (SEQ ID NO: 72)
CEA (NM_004363)

1 ctcagggcag agggaggaag gacagcagac cagacagtca cagcagcctt gacaaaacgt 
61 tcctggaact caagctcttc tccacagagg aggacagagc agacagcaga gaccatggag 
121 tctccctcgg cccctcccca cagatggtgc atcccctggc agaggctcct gctcacagcc 
181 tcacttctaa ccttctggaa cccgcccacc actgccaagc tcactattga atccacgccg 
241 ttcaatgtcg cagaggggaa ggaggtgctt ctacttgtcc acaatctgcc ccagcatctt 
301 tttggctaca gctggtacaa aggtgaaaga gtggatggca accgtcaaat tataggatat 
361 gtaataggaa ctcaacaagc taccccaggg cccgcataca gtggtcgaga gataatatac
421 cccaatgcat ccctgctgat ccagaacatc atccagaatg acacaggatt ctacacccta 
481 cacgtcataa agtcagatct tgtgaatgaa gaagcaactg gccagttccg ggtatacccg 
541 gagctgccca agccctccat ctccagcaac aactccaaac ccgtggagga caaggatgct 
601 gtggccttca cctgtgaacc tgagactcag gacgcaacct acctgtggtg ggtaaacaat 
661 cagagcctcc cggtcagtcc caggctgcag ctgtccaatg gcaacaggac cctcactcta 
721 ttcaatgtca caagaaatga cacagcaagc tacaaatgtg aaacccagaa cccagtgagt 
781 gccaggcgca gtgattcagt catcctgaat gtcctctatg gcccggatgc ccccaccatt 
841 tcccctctaa acacatctta cagatcaggg gaaaatctga acctctcctg ccacgcagcc 
901 tctaacccac ctgcacagta ctcttggttt gtcaatggga ctttccagca atccacccaa 
961 gagctcttta tccccaacat cactgtgaat aatagtggat cctatacgtg ccaagcccat 
1021 aactcagaca ctggcctcaa taggaccaca gtcacgacga tcacagtcta tgcagagcca 
1081 cccaaaccct tcatcaccag caacaactcc aaccccgtgg aggatgagga tgctgtagcc 
1141 ttaacctgtg aacctgagat tcagaacaca acctacctgt ggtgggtaaa taatcagagc 
1201 ctcccggtca gtcccaggct gcagctgtcc aatgacaaca ggaccctcac tctactcagt 
1261 gtcacaagga atgatgtagg accctatgag tgtggaatcc agaacgaatt aagtgttgac 
1321 cacagcgacc cagtcatcct gaatgtcctc tatggcccag acgaccccac catttccccc 
1381 tcatacacct attaccgtcc aggggtgaac ctcagcctct cctgccatgc agcctctaac
1441 ccacctgcac agtattcttg gctgattgat gggaacatcc agcaacacac acaagagctc 

15 01 tttatctcca acatcactga gaagaacagc ggactctata cctgccaggc caataactca 
1561 gccagtggcc acagcaggac tacagtcaag acaatcacag tctctgcgga gctgcccaag 
1621 ccctccatct ccagcaacaa ctccaaaccc gtggaggaca aggatgctgt ggccttcacc 

16 81 tgtgaacctg aggctcagaa cacaacctac ctgtggtggg taaatggtca gagcctccca 
1741 gtcagtccca ggctgcagct gtccaatggc aacaggaccc tcactctatt caatgtcaca 
1801 agaaatgacg caagagccta tgtatgtgga atccagaact cagtgagtgc aaaccgcagt 
1861 gacccagtca ccctggatgt cctctatggg ccggacaccc ccatcatttc ccccccagac 
1921 tcgtcttacc tttcgggagc gaacctcaac ctctcctgcc actcggcctc taacccatcc 
1981 ccgcagtatt cttggcgtat caatgggata ccgcagcaac acacacaagt tctctttatc 
2041 gccaaaatca cgccaaataa taacgggacc tatgcctgtt ttgtctctaa cttggctact
2101 ggccgcaata attccatagt caagagcatc acagtctctg catctggaac ttctcctggt 
2161 ctctcagctg gggccactgt cggcatcatg attggagtgc tggttggggt tgctctgata 
2221 tagcagccct ggtgtagttt cttcatttca ggaagactga cagttgtttt gcttcttcct 
2281 taaagcattt gcaacagcta cagtctaaaa ttgcttcttt accaaggata tttacagaaa
2341 agactctgac cagagatcga gaccatccta gccaacatcg tgaaacccca tctctactaa 
2401 aaatacaaaa atgagctggg cttggtggcg cgcacctgta gtcccagtta ctcgggaggc 
2461 tgaggcagga gaatcgcttg aacccgggag gtggagattg cagtgagccc agatcgcacc 
2521 actgcactcc agtctggcaa cagagcaaga ctccatctca aaaagaaaag aaaagaagac
2581 tctgacctgt actcttgaat acaagtttct gataccactg cactgtctga gaatttccaa 
2641 aactttaatg aactaactga cagcttcatg aaactgtcca ccaagatcaa gcagagaaaa 
2701 taattaattt catgggacta aatgaactaa tgaggattgc tgattcttta aatgtcttgt
2761 ttcccagatt tcaggaaact ttttttcttt taagctatcc actcttacag caatttgata
2821 aaatatactt ttgtgaacaa aaattgagac atttacattt tctccctatg tggtcgctcc 
2881 agacttggga aactattcat gaatatttat attgtatggt aatatagtta ttgcacaagt 
2941 tcaataaaaa tctgctcttt gtataacaga aaaa (SEQ ID NO: 73)
CEA (NM_004363)
MESPSAPPHRWCIPWQRLLLTASLLTFWNPPTTAKLTIESTPFN
VAEGKEVLLLVHNLPQHLFGYSWYKGERVDGNRQIIGYVIGTQQATPGPAYSGREIIY
PNASLLIQNIIQNDTGFYTLHVIKSDLVNEEATGQFRVYPELPKPSISSNNSKPVEDK
DAVAFTCEPETQDATYLWWVNNQSLPVSPRLQLSNGNRTLTLFNVTRNDTASYKCETQ
NPVSARRSDSVILNVLYGPDAPTISPLNTSYRSGENLNLSCHAASNPPAQYSWFVNGT
FQQSTQELFIPNITVNNSGSYTCQAHNSDTGLNRTTVTTITVYAEPPKPFITSNNSNP
VEDEDAVALTCEPEIQNTTYLWWVNNQSLPVSPRLQLSNDNRTLTLLSVTRNDVGPYE
CGIQNELSVDHSDPVILNVLYGPDDPTISPSYTYYRPGVNLSLSCHAASNPPAQYSWL
IDGNIQQHTQELFISNITEKNSGLYTCQANNSASGHSRTTVKTITVSAELPKPSISSN
NSKPVEDKDAVAFTCEPEAQNTTYLWWVNGQSLPVSPRLQLSNGNRTLTLFNVTRNDA
RAYVCGIQNSVSANRSDPVTLDVLYGPDTPIISPPDSSYLSGANLNLSCHSASNPSPQ
YSWRINGIPQQHTQVLFIAKITPNNNGTYACFVSNLATGRNNSIVKSITVSASGTSPG
LSAGATVGIMIGVLVGVALI (SEQ ID NO: 74)

NCA (NM_002483) 

1 ctcctctaca aagaggtgga cagagaagac agcagagacc atgggacccc cctcagcccc 
61 tccctgcaga ttgcatgtcc cctggaagga ggtcctgctc acagcctcac ttctaacctt 
121 ctggaaccca cccaccactg ccaagctcac tattgaatcc acgccattca atgtcgcaga 
181 ggggaaggag gttcttctac tcgcccacaa cctgccccag aatcgtattg gttacagctg 
241 gtacaaaggc gaaagagtgg atggcaacag tctaattgta ggatatgtaa taggaactca 
301 acaagctacc ccagggcccg catacagtgg tcgagagaca atatacccca atgcatccct
361 gctgatccag aacgtcaccc agaatgacac aggattctat accctacaag tcataaagtc 
421 agatcttgtg aatgaagaag caaccggaca gttccatgta tacccggagc tgcccaagcc 
481 ctccatctcc agcaacaact ccaaccccgt ggaggacaag gatgctgtgg ccttcacctg 
541 tgaacctgag gttcagaaca caacctacct gtggtgggta aatggtcaga gcctcccggt 
601 cagtcccagg ctgcagctgt ccaatggcaa catgaccctc actctactca gcgtcaaaag 
661 gaacgatgca ggatcctatg aatgtgaaat acagaaccca gcgagtgcca accgcagtga 
721 cccagtcacc ctgaatgtcc tctatggccc agatgtcccc accatttccc cctcaaaggc 
781 caattaccgt ccaggggaaa atctgaacct ctcctgccac gcagcctcta acccacctgc
841 acagtactct tggtttatca atgggacgtt ccagcaatcc acacaagagc tctttatccc 
901 caacatcact gtgaataata gcggatccta tatgtgccaa gcccataact cagccactgg 
961 cctcaatagg accacagtca cgatgatcac agtctctgga agtgctcctg tcctctcagc 
1021 tgtggccacc gtcggcatca cgattggagt gctggccagg gtggctctga tatagcagcc 
1081 ctggtgtatt ttcgatattt caggaagact ggcagattgg accagaccct gaattcttct 
1141 agctcctcca atcccatttt atcccatgga accactaaaa acaaggtctg ctctgctcct 
1201 gaagccctat atgctggaga tggacaactc aatgaaaatt taaagggaaa accctcaggc
1261 ctgaggtgtg tgccactcag agacttcacc taactagaga cagtcaaact gcaaaccatg 
1321 gtgagaaatt gacgacttca cactatggac agcttttccc aagatgtcaa aacaagactc 
1381 ctcatcatga taaggctctt accccctttt aatttgtcct tgcttatgcc tgcctctttc 
1441 gcttggcagg atgatgctgt cattagtatt tcacaagaag tagcttcaga gggtaactta 
1501 acagagtgtc agatctatct tgtcaatccc aacgttttac ataaaataag agatccttta 
1561 gtgcacccag tgactgacat tagcagcatc tttaacacag ccgtgtgttc aaatgtacag 
1621 tggtcctttt cagagttgga cttctagact cacctgttct cactccctgt tttaattcaa 
1681 cccagccatg caatgccaaa taatagaatt gctccctacc agctgaacag ggaggagtct
1741 gtgcagtttc tgacacttgt tgttgaacat ggctaaatac aatgggtatc gctgagacta 
1801 agttgtagaa attaacaaat gtgctgcttg gttaaaatgg ctacactcat ctgactcatt 
1861 ctttattcta ttttagttgg tttgtatctt gcctaaggtg cgtagtccaa ctcttggtat 
1921 taccctccta atagtcatac tagtagtcat actccctggt gtagtgtatt ctctaaaagc 
1981 tttaaatgtc tgcatgcagc cagccatcaa atagtgaatg gtctctcttt ggctggaatt
2041 acaaaactca gagaaatgtg tcatcaggag aacatcataa cccatgaagg ataaaagccc 
2101 caaatggtgg taactgataa tagcactaat gctttaagat ttggtcacac tctcacctag 
2161 gtgagcgcat tgagccagtg gtgctaaatg ctacatactc caactgaaat gttaaggaag 
2221 aagatagatc caaaaaaaaa aaaaaaaaa (SEQ ID NO: 75) 

NCA (NM_002483) 
MGPPSAPPCRLHVPWKEVLLTASLLTFWNPPTTAKLTIESTPFN
VAEGKEVLLLAHNLPQNRIGYSWYKGERVDGNSLIVGYVIGTQQATPGPAYSGRETIY
PNASLLIQNVTQNDTGFYTLQVIKSDLVNEEATGQFHVYPELPKPSISSNNSNPVEDK
DAVAFTCEPEVQNTTYLWWVNGQSLPVSPRLQLSNGNMTLTLLSVKRNDAGSYECEIQ
NPASANRSDPVTLNVLYGPDVPTISPSKANYRPGENLNLSCHAASNPPAQYSWFINGT
FQQSTQELFIPNITVNNSGSYMCQAHNSATGLNRTTVTMITVSGSAPVLSAVATVGIT
IGVLARVALI (SEQ ID NO: 76)

Follistatin (NM_006350) 

1 gctcctcgcc ccgcgcctgc ccccaggatg gtccgcgcga ggcaccagcc gggtgggctt 
61 tgcctcctgc tgctgctgct ctgccagttc atggaggacc gcagtgccca ggctgggaac 
121 tgctggctcc gtcaagcgaa gaacggccgc tgccaggtcc tgtacaagac cgaactgagc 
181 aaggaggagt gctgcagcac cggccggctg agcacctcgt ggaccgagga ggacgtgaat 
241 gacaacacac tcttcaagtg gatgattttc aacgggggcg cccccaactg catcccctgt 
301 aaagaaacgt gtgagaacgt ggactgtgga cctgggaaaa aatgccgaat gaacaagaag
361 aacaaacccc gctgcgtctg cgccccggat tgttccaaca tcacctggaa gggtccagtc
421 tgcgggctgg atgggaaaac ctaccgcaat gaatgtgcac tcctaaaggc aagatgtaaa
481 gagcagccag aactggaagt ccagtaccaa ggcagatgta aaaagacttg tcgggatgtt
541 ttctgtccag gcagctccac atgtgtggtg gaccagacca ataatgccta ctgtgtgacc 
601 tgtaatcgga tttgcccaga gcctgcttcc tctgagcaat atctctgtgg gaatgatgga 
661 gtcacctact ccagtgcctg ccacctgaga aaggctacct gcctgctggg cagatctatt 
721 ggattagcct atgagggaaa gtgtatcaaa gcaaagtcct gtgaagatat ccagtgcact 
781 ggtgggaaaa aatgtttatg ggatttcaag gttgggagag gccggtgttc cctctgtgat 
841 gagctgtgcc ctgacagtaa gtcggatgag cctgtctgtg ccagtgacaa tgccacttat 
901 gccagcgagt gtgccatgaa ggaagctgcc tgctcctcag gtgtgctact ggaagtaaag 
961 cactccggat cttgcaactg aatctgcccg taaaacctga gccattgatt cttcagaact 

1021 ttctgcagtt tttgacttca tagattatgc tttaaaaaat tttttttaac ttattgcata 
1081 acagcagatg ccaaaaacaa aaaaagcatc tcactgcaag tcacataaaa atgcaacgct
1141 gtaatatggc tgtatcagag ggctttgaaa acatacactg agctgcttct gcgctgttgt 
1201 tgtccgtatt taaacaacag ctcccctgta ttcccccatc tagccatttc ggaagacacc 
1261 gaggaagagg aggaagatga agaccaggac tacagctttc ctatatcttc tattctagag 
1321 tggtaaactc tctataagtg ttcagtgttc acatagcctt tgtgcaaaaa aaaaaaaaaa 
1381 aaaaaa (SEQ ID NO: 77) 

Follistatin (NM_006350) 
MVRARHQPGGLCLLLLLLCQFMEDRSAQAGNCWLRQAKNGRCQV
LYKTELSKEECCSTGRLSTSWTEEDVNDNTLFKWMIFNGGAPNCIPCKETCENVDCGP
GKKCRMNKKNKPRCVCAPDCSNITWKGPVCGLDGKTYRNECALLKARCKEQPELEVQY
QGRCKKTCRDVFCPGSSTCVVDQTNNAYCVTCNRICPEPASSEQYLCGNDGVTYSSAC
HLRKATCLLGRSIGLAYEGKCIKAKSCEDIQCTGGKKCLWDFKVGRGRCSLCDELCPD
SKSDEPVCASDNATYASECAMKEAACSSGVLLEVKHSGSCN (SEQ ID NO: 78)



Claudin 1 (NM_021101)

1 gagcaaccgc agcttctagt atccagactc cagcgccgcc ccgggcgcgg accccaaccc
61 cgacccagag cttctccagc ggcggcgcag cgagcagggc tccccgcctt aacttcctcc
121 gcggggccca gccaccttcg ggagtccggg ttgcccacct gcaaactctc cgccttctgc
181 acctgccacc cctgagccag cgcgggcgcc cgagcgagtc atggccaacg cggggctgca
241 gctgttgggc ttcattctcg ccttcctggg atggatcggc gccatcgtca gcactgccct
301 gccccagtgg aggatttact cctatgccgg cgacaacatc gtgaccgccc aggccatgta
361 cgaggggctg tggatgtcct gcgtgtcgca gagcaccggg cagatccagt gcaaagtctt
421 tgactccttg ctgaatctga gcagcacatt gcaagcaacc cgtgccttga tggtggttgg
481 catcctcctg ggagtgatag caatctttgt ggccaccgtt ggcatgaagt gtatgaagtg
541 cttggaagac gatgaggtgc agaagatgag gatggctgtc attgggggtg cgatatttct
601 tcttgcaggt ctggctattt tagttgccac agcatggtat ggcaatagaa tcgttcaaga
661 attctatgac cctatgaccc cagtcaatgc caggtacgaa tttggtcagg ctctcttcac
721 tggctgggct gctgcttctc tctgccttct gggaggtgcc ctactttgct gttcctgtcc
781 ccgaaaaaca acctcttacc caacaccaag gccctatcca aaacctgcac cttccagcgg
841 gaaagactac gtgtgacaca gaggcaaaag gagaaaatca tgttgaaaca aaccgaaaat
901 ggacattgag atactatcat taacattagg accttagaat tttgggtatt gtaatctgaa
961 gtatggtatt acaaaacaaa caaacaaaca aaaaacccat gtgttaaaat actcagtgct
1021 aaacatggct taatcttatt ttatcttctt tcctcaatat aggagggaag atttttccat
1081 ttgtattact gcttcccatt gagtaatcat actcaattgg gggaaggggt gctccttaaa
1141 tatatataga tatgtatata tacatgtttt tctattaaaa atagacagta aaatactatt
1201 ctcattatgt tgatactagc atacttaaaa tatctctaaa ataggtaaat gtatttaatt
1261 ccatattgat gaagatgttt attggtatat tttctttttc gtctatatat acatatgtaa
1321 cagtcaaata tcatttactc ttcttcatta gctttgggtg cctttgccac aagacctagc
1381 ctaatttacc aaggatgaat tctttcaatt cttcatgcgt gcccttttca tatacttatt
1441 ttatttttta ccataatctt atagcacttg catcgttatt aagcccttat ttgttttgtg
1501 tttcattggt ctctatctcc tgaatctaac acatttcata gcctacattt tagtttctaa
1561 agccaagaag aatttattac aaatcagaac tttggaggca aatctttctg catgaccaaa
1621 gtgataaatt cctgttgacc ttcccacaca atccctgtac tctgacccat agcactcttg
1681 tttgctttga aaatatttgt ccaattgagt agctgcatgc tgttccccca ggtgttgtaa
1741 cacaacttta ttgattgaat ttttaagcta cttattcata gttttatatc cccctaaact
1801 acctttttgt tccccattcc ttaattgtat tgttttccca agtgtaatta tcatgcgttt
1861 tatatcttcc taataaggtg tggtctgttt gtctgaacaa agtgctagac tttctggagt
1921 gataatctgg tgacaaatat tctctctgta gctgtaagca agtcacttaa tctttctacc
1981 tcttttttct atctgccaaa ttgagataat gatacttaac cagttagaag aggtagtgtg
2041 aatattaatt agtttatatt actctcattc tttgaacatg aactatgcct atgtagtgtc
2101 tttatttgct cagctggctg agacactgaa gaagtcactg aacaaaacct acacacgtac
2161 cttcatgtga ttcactgcct tcctctctct accagtctat ttccactgaa caaaacctac
2221 acacatacct tcatgtggtt cagtgccttc ctctctctac cagtctattt ccactgaaca
2281 aaacctacgc acataccttc atgtggctca gtgccttcct ctctctacca gtctatttcc
2341 attctttcag ctgtgtctga catgtttgtg ctctgttcca ttttaacaac tgctcttact
2401 tttccagtct gtacagaatg ctatttcact tgagcaagat gatgtaatgg aaagggtgtt
2461 ggcattggtg tctggagacc tggatttgag tcttggtgct atcaatcacc gtctgtgttt
2521 gagcaaggca tttggctgct gtaagcttat tgcttcatct gtaagcggtg gtttgtaatt
2581 cctgatcttc ccacctcaca gtgatgttgt ggggatccag tgagatagaa tacatgtaag
2641 tgtggttttg taatttaaaa agtgctatac taagggaaag aattgaggaa ttaactgcat
2701 acgttttggt gttgcttttc aaatgtttga aaacaaaaaa aatgttaaga aatgggtttc
2761 ttgccttaac cagtctctca agtgatgaga cagtgaagta aaattgagtg cactaaacaa
2821 ataagattct gaggaagtct tatcttctgc agtgagtatg gcccgatgct ttctgtggct
2881 aaacagatgt aatgggaaga aataaaagcc tacgtgttgg taaatccaac agcaagggag
2941 atttttgaat cataataact cataaggtgc tatctgttca gtgatgccct cagagctctt
3001 gctgttagct ggcagctgac gctgctagga tagttagttt ggaaatggta cttcataata
3061 aactacacaa ggaaagtcag ccactgtgtc ttatgaggaa ttggacctaa taaattttag
3121 tgtgccttcc aaacctgaga atatatgctt ttggaagtta aaatttaaat ggcttttgcc
3181 acatacatag atcttcatga tgtgtgagtg taattccatg tggatatcag ttaccaaaca
3241 ttacaaaaaa attttatggc ccaaaatgac caacgaaatt gttacaatag aatttatcca
3301 attttgatct ttttatattc ttctaccaca cctggaaaca gaccaataga cattttgggg
3361 ttttataata ggaatttgta taaagcatta ctctttttca ataaattgtt ttttaattta
3421 aaaaaaggaa aaaaaaaaaa aaaaa (SEQ ID NO: 79)

Claudin 1 (NM_021101) 
MANAGLQLLGFILAFLGWIGAIVSTALPQWRIYSYAGDNIVTAQ
AMYEGLWMSCVSQSTGQIQCKVFDSLLNLSSTLQATRALMVVGILLGVIAIFVATVGM
KCMKCLEDDEVQKMRMAVIGGAIFLLAGLAILVATAWYGNRIVQEFYDPMTPVNARYE
FGQALFTGWAAASLCLLGGALLCCSCPRKTTSYPTPRPYPKPAPSSGKDYV (SEQ ID NO: 80)



Claudin 14 (NM_012130)



1 gtttgcttca ccttctgcca ggattgtaag tttcctgagg cctccccagt cctgcggaac 

61 tggctccggc tggcacctga ggagcggcgt gaccccgagg gcccagggag ctgcccggct 

121 ggcctaggca ggcagccgca ccatggccag cacggccgtg cagcttctgg gcttcctgct 

181 cagcttcctg ggcatggtgg gcacgttgat caccaccatc ctgccgcact ggcggaggac 

241 agcgcacgtg ggcaccaaca tcctcacggc cgtgtcctac ctgaaagggc tctggatgga 

301 gtgtgtgtgg cacagcacag gcatctacca gtgccagatc taccgatccc tgctggcgct

361 gccccaagac ctccaggctg cccgcgccct catggtcatc tcctgcctgc tctcgggcat

421 agcctgcgcc tgcgccgtca tcgggatgaa gtgcacgcgc tgcgccaagg gcacacccgc 

481 caagaccacc tttgccatcc tcggcggcac cctcttcatc ctggccggcc tcctgtgcat 

541 ggtggccgtc tcctggacca ccaacgacgt ggtgcagaac ttctacaacc cgctgctgcc 

601 cagcggcatg aagtttgaga ttggccaggc cctgtacctg ggcttcatct cctcgtccct 

661 ctcgctcatt ggtggcaccc tgctttgcct gtcctgccag gacgaggcac cctacaggcc 

721 ctaccaggcc ccgcccaggg ccaccacgac cactgcaaac accgcacctg cctaccagcc 

781 accagctgcc tacaaagaca atcgggcccc ctcagtgacc tcggccacgc acagcgggta 

841 caggctgaac gactacgtgt gagtccccac agcctgcttc tcccctgggc tgctgtgggc 

901 tgggtccccg gcgggactgt caatggaggc aggggttcca gcacaaagtt tacttctggg 

961 caatttttgt atccaaggaa ataatgtgaa tgcgaggaaa tgtctttaga gcacagggac 

1021 agagggggaa ataagaggag gagaaagctc tctataccaa agactgaaaa aaaaaatcct 

1081 gtctgttttt gtatttatta tatatattta tgtgggtgat ttgataacaa gtttaatata 

1141 aagtgacttg ggagtttggt cagtggggtt ggtttgtgat ccaggaataa accttgcgga 

1201 tgtggctgtt tatgaaaaaa aaaaaaaaaa aaa (SEQ ID NO: 81)

Claudin 14 (NM_012130)
MASTAVQLLGFLLSFLGMVGTLITTILPHWRRTAHVGTNILTAV
SYLKGLWMECVWHSTGIYQCQIYRSLLALPQDLQAARALMVISCLLSGIACACAVIGM
KCTRCAKGTPAKTTFAILGGTLFILAGLLCMVAVSWTTNDVVQNFYNPLLPSGMKFEI
GQALYLGFISSSLSLIGGTLLCLSCQDEAPYRPYQAPPRATTTTANTAPAYQPPAAYK
DNRAPSVTSATHSGYRLNDYV (SEQ ID NO: 82)

Tenascin-R (NM_003285)
1 ccttggtttc cgttgcagat tcccacaact ccatgctgtg tgctgcaggc tggtcctgaa
61 cccagatctc tggctgagag gatgggggca gatggggaaa cagtggttct gaagaacatg
121 ctcattggcg tcaacctgat ccttctgggc tccatgatca agccttcaga gtgtcagctg
181 gaggtcacca cagaaagggt ccagagacag tcagtggagg aggagggagg cattgccaac
241 tacaacacgt ccagcaaaga gcagcctgtg gtcttcaacc acgtgtacaa cattaacgtg
301 cccttggaca acctctgctc ctcagggcta gaggcctctg ctgagcagga ggtgagtgca
361 gaagacgaga ctctggcaga gtacatgggc cagacctcag accacgagag ccaggtcacc
421 tttacacaca ggatcaactt ccccaaaaag gcctgtccat gtgccagttc agcccaggtg
481 ctgcaggagc tgctgagccg gatcgagatg ctggagaggg aggtgtcggt gctgcgagac
541 cagtgcaacg ccaactgctg ccaagaaagt gctgccacag gacaactgga ctatatccct
601 cactgcagtg gccacggcaa ctttagcttt gagtcctgtg gctgcatctg caacgaaggc
661 tggtttggca agaattgctc ggagccctac tgcccgctgg gttgctccag ccggggggtg
721 tgtgtggatg gccagtgcat ctgtgacagc gaatacagcg gggatgactg ttccgaactc
781 cggtgcccaa cagactgcag ctcccggggg ctctgcgtgg acggggagtg tgtctgtgaa
841 gagccctaca ctggcgagga ctgcagggaa ctgaggtgcc ctggggactg ttcggggaag
901 gggagatgtg ccaacggtac ctgtttatgc gaggagggct acgttggtga ggactgcggc
961 cagcggcagt gtctgaatgc ctgcagtggg cgaggacaat gtgaggaggg gctctgcgtc
1021 tgtgaagagg gctaccaggg ccctgactgc tcagcagttg cccctccaga ggacttgcga
1081 gtggctggta tcagcgacag gtccattgag ctggaatggg acgggccgat ggcagtgacg
1141 gaatatgtga tctcttacca gccgacggcc ctggggggcc tccagctcca gcagcgggtg
1201 cctggagatt ggagtggtgt caccatcacg gagctggagc caggtctcac ctacaacatc
1261 agcgtctacg ctgtcattag caacatcctc agccttccca tcactgccaa ggtggccacc
1321 catctctcca ctcctcaagg gctacaattt aagacgatca cagagaccac cgtggaggtg
1381 cagtgggagc ccttctcatt ttccttcgat gggtgggaaa tcagcttcat tccaaagaac
1441 aatgaagggg gagtgattgc tcaggtcccc agcgatgtta cgtcctttaa ccagacagga
1501 ctaaagcctg gggaggaata cattgtcaat gtggtggctc tgaaagaaca ggcccgcagc
1561 ccccctacct cggccagcgt ctccacagtc attgacggcc ccacgcagat cctggttcgc
1621 gatgtctcgg acaccgtggc ttttgtggag tggattcccc ctcgagccaa agtcgatttc
1681 attcttttga aatatggcct ggtgggcggg gaaggtggga ggaccacctt ccggctgcag
1741 cctcccctga gccaatactc agtgcaggcc ctgcggcctg gctcccgata cgaggtgtca
1801 gtcagtgccg tccgagggac caacgagagc gattctgcca ccactcagtt cacaacagag
1861 atcgatgccc ccaagaactt gcgagttggt tctcgcacag caaccagcct tgacctcgag
1921 tgggataaca gtgaagccga agttcaggag tacaaggttg tgtacagcac cctggcgggt
1981 gagcaatatc atgaggtact ggtccccagg ggcattggtc caaccaccag ggccaccctg
2041 acagatctgg tacctggcac tgagtatgga gttggaatat ctgccgtcat gaactcacag
2101 caaagcgtgc cagccaccat gaatgccagg actgaacttg acagtccccg agacctcatg
2161 gtgacagcct cctcggagac ctccatctcc ctcatctgga ccaaggccag tggccccatt
2221 gaccactacc gaattacctt taccccatcc tctgggattg cctcagaagt caccgtaccc
2281 aaggacagga cctcatacac actaacagat ctagagcctg gggcagagta catcatttcc
2341 gtcactgctg agaggggtcg gcagcagagc ttggagtcca ctgtggatgc tttcacaggc
2401 ttccgtccca tctctcatct gcacttttct catgtgacct cctccagtgt gaacatcact
2461 tggagtgatc catctccccc agcagacaga ctcattctta actacagccc cagggatgag
2521 gaggaagaga tgatggaggt ctccctggat gccaccaaga ggcatgctgt cctgatgggc
2581 ctgcaaccag ccacagagta tattgtgaac cttgtggctg tccatggcac agtgacctct
2641 gagcccattg tgggctccat caccacagga attgatcccc caaaagacat cacaattagc
2701 aatgtgacca aggactcagt gatggtctcc tggagccctc ctgttgcatc tttcgattac
2761 taccgagtat catatcgacc cacccaagtg ggacgactag acagctcagt ggtgcccaac
2821 actgtgacag aattcaccat caccagactg aacccagcta ccgaatacga aatcagcctc
2881 aacagcgtgc ggggcaggga ggaaagcgag cgcatctgta ctcttgtgca cacagccatg
2941 gacaaccctg tggatctgat tgctaccaat atcactccaa cagaagccct gctgcagtgg
3001 aaggcaccag tgggtgaggt ggagaactac gtcattgttc ttacacactt tgcagtcgct
3061 ggagagacca tccttgttga cggagtcagt gaggaatttc ggcttgttga cctgcttcct
3121 agcacccact atactgccac catgtatgcc accaatggac ctctcaccag tggcaccatc
3181 agcaccaact tttctactct cctggaccct ccggcaaacc tgacagccag tgaagtcacc
3241 agacaaagtg ccctgatctc ctggcagcct cccagggcag agattgaaaa ttatgtcttg
3301 acctacaaat ccaccgacgg aagccgcaag gagctgattg tggatgcaga agacacctgg
3361 attcgactgg agggcctgtt ggagaacaca gactacacgg tgctcctgca ggcagcacag
3421 gacaccacgt ggagcagcat cacctccacc gctttcacca caggaggccg ggtgttccct
3481 catccccaag actgtgccca gcatttgatg aatggagaca ctttgagtgg ggtttacccc
3541 atcttcctca atggggagct gagccagaaa ttacaagtgt actgtgatat gaccaccgac
3601 gggggcggct ggattgtatt ccagaggcgg cagaatggcc aaactgattt tttccggaaa
3661 tgggctgatt accgtgttgg cttcgggaac gtggaggatg agttctggct ggggctggac
3721 aatatacaca ggatcacatc ccagggccgc tatgagctgc gcgtggacat gcgggatggc
3781 caggaggccg ccttcgcctc ctacgacagg ttctctgtcg aggacagcag aaacctgtac
3841 aaactccgca taggaagcta caacggcact gcgggggact ccctcagcta tcatcaagga
3901 cgccctttct ccacagagga tagagacaat gatgttgcag tgactaactg tgccatgtcg
3961 tacaagggag catggtggta taagaactgc caccggacca acctcaatgg gaagtacggg
4021 gagtccaggc acagtcaggg catcaactgg taccattgga aaggccatga gttctccatc
4081 ccctttgtgg aaatgaagat gcgcccctac aaccaccgtc tcatggcagg gagaaaacgg
4141 cagtccttac agttctgagc agtgggcggc tgcaagccaa ccaatatttt ctgtcatttg
4201 tttgtatttt ataatatgaa acaagggggg agggtaatag caatgtgttt tgcaacatat
4261 taagagtatg tgaaggaagc agggatgtcg caggaatccg ctggctaaca tctgctcttg
4321 gtttctgctg ccctggagcc tgaccctcag tctccattct ccctcctacc caggcctcct
4381 caaccttcac ctcctttccc accaaggagg agaagtagga agttttctta aagggccaat
4441 tcaaagccaa gtcgtggggt gcagattgtt atggtgacag gcacacacat ttttctaccc
4501 ttcttctgag atgtcctctg ccttccaggt atttgtgatt ttgtcacagc ctgacatggc
4561 caggttctca cactggccca gagaaaagag cctcagcaag agagttttgc caacaattcc
4621 ccttaaaagg aaacagatca actacaccgc atcccaacaa cccaggttct tttccttcct
4681 tccttccttc ctcccttcct tctttcctgc cttccc (SEQ ID NO: 83)
Tenascin-R (NM_003285)
MGADGETVVLKNMLIGVNLILLGSMIKPSECQLEVTTERVQRQS
VEEEGGIANYNTSSKEQPVVFNHVYNINVPLDNLCSSGLEASAEQEVSAEDETLAEYM
GQTSDHESQVTFTHRINFPKKACPCASSAQVLQELLSRIEMLEREVSVLRDQCNANCC
QESAATGQLDYIPHCSGHGNFSFESCGCICNEGWFGKNCSEPYCPLGCSSRGVCVDGQ
CICDSEYSGDDCSELRCPTDCSSRGLCVDGECVCEEPYTGEDCRELRCPGDCSGKGRC
ANGTCLCEEGYVGEDCGQRQCLNACSGRGQCEEGLCVCEEGYQGPDCSAVAPPEDLRV
AGISDRSIELEWDGPMAVTEYVISYQPTALGGLQLQQRVPGDWSGVTITELEPGLTYN
ISVYAVISNILSLPITAKVATHLSTPQGLQFKTITETTVEVQWEPFSFSFDGWEISFI
PKNNEGGVIAQVPSDVTSFNQTGLKPGEEYIVNVVALKEQARSPPTSASVSTVIDGPT
QILVRDVSDTVAFVEWIPPRAKVDFILLKYGLVGGEGGRTTFRLQPPLSQYSVQALRP
GSRYEVSVSAVRGTNESDSATTQFTTEIDAPKNLRVGSRTATSLDLEWDNSEAEVQEY
KVVYSTLAGEQYHEVLVPRGIGPTTRATLTDLVPGTEYGVGISAVMNSQQSVPATMNA
RTELDSPRDLMVTASSETSISLIWTKASGPIDHYRITFTPSSGIASEVTVPKDRTSYT
LTDLEPGAEYIISVTAERGRQQSLESTVDAFTGFRPISHLHFSHVTSSSVNITWSDPS
PPADRLILNYSPRDEEEEMMEVSLDATKRHAVLMGLQPATEYIVNLVAVHGTVTSEPI
VGSITTGIDPPKDITISNVTKDSVMVSWSPPVASFDYYRVSYRPTQVGRLDSSVVPNT
VTEFTITRLNPATEYEISLNSVRGREESERICTLVHTAMDNPVDLIATNITPTEALLQ
WKAPVGEVENYVIVLTHFAVAGETILVDGVSEEFRLVDLLPSTHYTATMYATNGPLTS
GTISTNFSTLLDPPANLTASEVTRQSALISWQPPRAEIENYVLTYKSTDGSRKELIVD
AEDTWIRLEGLLENTDYTVLLQAAQDTTWSSITSTAFTTGGRVFPHPQDCAQHLMNGD
TLSGVYPIFLNGELSQKLQVYCDMTTDGGGWIVFQRRQNGQTDFFRKWADYRVGFGNV
EDEFWLGLDNIHRITSQGRYELRVDMRDGQEAAFASYDRFSVEDSRNLYKLRIGSYNG
TAGDSLSYHQGRPFSTEDRDNDVAVTNCAMSYKGAWWYKNCHRTNLNGKYGESRHSQG
INWYHWKGHEFSIPFVEMKMRPYNHRLMAGRKRQSLQF (SEQ ID NO: 84)
CAD3 (NM_001793)

1 aaaggggcaa gagctgagcg gaacaccggc ccgccgtcgc ggcagctgct tcacccctct 
61 ctctgcagcc atggggctcc ctcgtggacc tctcgcgtct ctcctccttc tccaggtttg 
121 ctggctgcag tgcgcggcct ccgagccgtg ccgggcggtc ttcagggagg ctgaagtgac 
181 cttggaggcg ggaggcgcgg agcaggagcc cggccaggcg ctggggaaag tattcatggg 
241 ctgccctggg caagagccag ctctgtttag cactgataat gatgacttca ctgtgcggaa 
301 tggcgagaca gtccaggaaa gaaggtcact gaaggaaagg aatccattga agatcttccc
361 atccaaacgt atcttacgaa gacacaagag agattgggtg gttgctccaa tatctgtccc
421 tgaaaatggc aagggtccct tcccccagag actgaatcag ctcaagtcta ataaagatag 
481 agacaccaag attttctaca gcatcacggg gccgggggca gacagccccc ctgagggtgt 
541 cttcgctgta gagaaggaga caggctggtt gttgttgaat aagccactgg accgggagga 
601 gattgccaag tatgagctct ttggccacgc tgtgtcagag aatggtgcct cagtggagga
661 ccccatgaac atctccatca tcgtgaccga ccagaatgac cacaagccca agtttaccca 
721 ggacaccttc cgagggagtg tcttagaggg agtcctacca ggtacttctg tgatgcaggt 
781 gacagccacg gatgaggatg atgccatcta cacctacaat ggggtggttg cttactccat 
841 ccatagccaa gaaccaaagg acccacacga cctcatgttc accattcacc ggagcacagg 
901 caccatcagc gtcatctcca gtggcctgga ccgggaaaaa gtccctgagt acacactgac 
961 catccaggcc acagacatgg atggggacgg ctccaccacc acggcagtgg cagtagtgga 
1021 gatccttgat gccaatgaca atgctcccat gtttgacccc cagaagtacg aggcccatgt 
1081 gcctgagaat gcagtgggcc atgaggtgca gaggctgacg gtcactgatc tggacgcccc 
1141 caactcacca gcgtggcgtg ccacctacct tatcatgggc ggtgacgacg gggaccattt 

1201 taccatcacc acccaccctg agagcaacca gggcatcctg acaaccagga agggtttgga
1261 ttttgaggcc aaaaaccagc acaccctgta cgttgaagtg accaacgagg ccccttttgt
1321 gctgaagctc ccaacctcca cagccaccat agtggtccac gtggaggatg tgaatgaggc
1381 acctgtgttt gtcccaccct ccaaagtcgt tgaggtccag gagggcatcc ccactgggga
1441 gcctgtgtgt gtctacactg cagaagaccc tgacaaggag aatcaaaaga tcagctaccg 
1501 catcctgaga gacccagcag ggtggctagc catggaccca gacagtgggc aggtcacagc 
1561 tgtgggcacc ctcgaccgtg aggatgagca gtttgtgagg aacaacatct atgaagtcat 
1621 ggtcttggcc atggacaatg gaagccctcc caccactggc acgggaaccc ttctgctaac 
1681 actgattgat gtcaatgacc atggcccagt ccctgagccc cgtcagatca ccatctgcaa 
1741 ccaaagccct gtgcgccagg tgctgaacat cacggacaag gacctgtctc cccacacctc 
1801 ccctttccag gcccagctca cagatgactc agacatctac tggacggcag aggtcaacga 
1861 ggaaggtgac acagtggtct tgtccctgaa gaagttcctg aagcaggata catatgacgt 
1921 gcacctttct ctgtctgacc atggcaacaa agagcagctg acggtgatca gggccactgt 
1981 gtgcgactgc catggccatg tcgaaacctg ccctggaccc tggaagggag gtttcatcct 
2041 ccctgtgctg ggggctgtcc tggctctgct gttcctcctg ctggtgctgc ttttgttggt
2101 gagaaagaag cggaagatca aggagcccct cctactccca gaagatgaca cccgtgacaa 
2161 cgtcttctac tatggcgaag aggggggtgg cgaagaggac caggactatg acatcaccca 
2221 gctccaccga ggtctggagg ccaggccgga ggtggttctc cgcaatgacg tggcaccaac 
2281 catcatcccg acacccatgt accgtcctcg gccagccaac ccagatgaaa tcggcaactt
2341 tataattgag aacctgaagg cggctaacac agaccccaca gccccgccct acgacaccct 
2401 cttggtgttc gactatgagg gcagcggctc cgacgccgcg tccctgagct ccctcacctc 
2461 ctccgcctcc gaccaagacc aagattacga ttatctgaac gagtggggca gccgcttcaa 
2521 gaagctggca gacatgtacg gtggcgggga ggacgactag gcggcctgcc tgcagggctg 
2581 gggaccaaac gtcaggccac agagcatctc caaggggtct cagttccccc ttcagctgag 
2641 gacttcggag cttgtcagga agtggccgta gcaacttggc ggagacaggc tatgagtctg
2701 acgttagagt ggttgcttcc ttagcctttc aggatggagg aatgtgggca gtttgacttc
2761 agcactgaaa acctctccac ctgggccagg gttgcctcag aggccaagtt tccagaagcc 
2821 tcttacctgc cgtaaaatgc tcaaccctgt gtcctgggcc tgggcctgct gtgactgacc
2881 tacagtggac tttctctctg gaatggaacc ttcttaggcc tcctggtgca acttaatttt 

2941 tttttttaat gctatcttca aaacgttaga gaaagttctt caaaagtgca gcccagagct
3001 gctgggccca ctggccgtcc tgcatttctg gtttccagac cccaatgcct cccattcgga
3061 tggatctctg cgtttttata ctgagtgtgc ctaggttgcc ccttattttt tattttccct
3121 gttgcgttgc tatagatgaa gggtgaggac aatcgtgtat atgtactaga acttttttat
3181 taaagaaact tttcccagaa aaaaa (SEQ ID NO: 85)
CAD3 (NM_001793)
MGLPRGPLASLLLLQVCWLQCAASEPCRAVFREAEVTLEAGGAE
QEPGQALGKVFMGCPGQEPALFSTDNDDFTVRNGETVQERRSLKERNPLKIFPSKRIL
RRHKRDWVVAPISVPENGKGPFPQRLNQLKSNKDRDTKIFYSITGPGADSPPEGVFAV
EKETGWLLLNKPLDREEIAKYELFGHAVSENGASVEDPMNISIIVTDQNDHKPKFTQD
TFRGSVLEGVLPGTSVMQVTATDEDDAIYTYNGVVAYSIHSQEPKDPHDLMFTIHRST
GTISVISSGLDREKVPEYTLTIQATDMDGDGSTTTAVAVVEILDANDNAPMFDPQKYE
AHVPENAVGHEVQRLTVTDLDAPNSPAWRATYLIMGGDDGDHFTITTHPESNQGILTT
RKGLDFEAKNQHTLYVEVTNEAPFVLKLPTSTATIVVHVEDVNEAPVFVPPSKVVEVQ
EGIPTGEPVCVYTAEDPDKENQKISYRILRDPAGWLAMDPDSGQVTAVGTLDREDEQF
VRNNIYEVMVLAMDNGSPPTTGTGTLLLTLIDVNDHGPVPEPRQITICNQSPVRQVLN
ITDKDLSPHTSPFQAQLTDDSDIYWTAEVNEEGDTVVLSLKKFLKQDTYDVHLSLSDH
GNKEQLTVIRATVCDCHGHVETCPGPWKGGFILPVLGAVLALLFLLLVLLLLVRKKRK
IKEPLLLPEDDTRDNVFYYGEEGGGEEDQDYDITQLHRGLEARPEVVLRNDVAPTIIP
TPMYRPRPANPDEIGNFIIENLKAANTDPTAPPYDTLLVFDYEGSGSDAASLSSLTSS
ASDQDQDYDYLNEWGSRFKKLADMYGGGEDD (SEQ ID NO: 86)
CONT (NM_001843)

1 gctgtgccgc accgaggcga gcaggagcag ggaacaggtg tttaaaatta tccaactgcc
61 atagagctaa attctttttt ggaaaattga accgaacttc tactgaatac aagatgaaaa
121 tgtggttgct ggtcagtcat cttgtgataa tatctattac tacctgttta gcagagttta
181 catggtatag aagatatggt catggagttt ctgaggaaga caaaggattt ggaccaattt
241 ttgaagagca gccaatcaat accatttatc cagaggaatc actggaagga aaagtctcac
301 tcaactgtag ggcacgagcc agccctttcc cggtttacaa atggagaatg aataatgggg
361 acgttgatct cacaagtgat cgatacagta tggtaggagg aaaccttgtt atcaacaacc
421 ctgacaaaca gaaagatgct ggaatatact actgtttagc atctaataac tacgggatgg
481 tcagaagcac tgaagcaacc ctgagctttg gatatcttga tcctttccca cctgaggaac
541 gtcctgaggt cagagtaaaa gaagggaaag gaatggtgct tctctgtgac cccccatacc
601 attttccaga tgatcttagc tatcgctggc ttctaaatga atttcctgta tttatcacaa
661 tggataaacg gcgatttgtg tctcagacaa atggcaatct ctacattgca aatgttgagg
721 cttccgacaa aggcaattat tcctgctttg tttccagtcc ttctattaca aagagcgtgt
781 tcagcaaatt catcccactc attccaatac ctgaacgaac aacaaaacca tatcctgctg
841 atattgtagt tcagttcaag gatgtatatg cattgatggg ccaaaatgtg accttagaat
901 gttttgcact tggaaatcct gttccggata tccgatggcg gaaggttcta gaaccaatgc
961 caagcactgc tgagattagc acctctgggg ctgttcttaa gatcttcaat attcagctag
1021 aagatgaagg catctatgaa tgtgaggctg agaacattag aggaaaggat aaacatcaag
1081 caagaattta tgttcaagca ttccctgagt gggtagaaca catcaatgac acagaggtgg
1141 acataggcag tgatctctac tggccttgtg tggccacagg aaagcccatc cctacaatcc
1201 gatggttgaa aaatggatat gcgtatcata aaggggaatt aagactgtat gatgtgactt
1261 ttgaaaatgc cggaatgtat cagtgcatag ctgaaaacac atatggagcc atttatgcaa
1321 atgctgagtt gaagatcttg gcgttggctc caacttttga aatgaatcct atgaagaaaa
1381 agatcctggc tgctaaaggt ggaagggtga taattgaatg caaacctaaa gctgcaccga
1441 aaccaaagtt ttcatggagt aaagggacag agtggcttgt caatagcagc agaatactca
1501 tttgggaaga tggtagcttg gaaatcaaca acattacaag gaatgatgga ggtatctata
1561 catgctttgc agaaaataac agagggaaag ctaatagcac tggaaccctt gttatcacag
1621 atcctacgcg aattatattg gccccaatta atgccgatat cacagttgga gaaaacgcca
1681 ccatgcagtg tgctgcgtcc tttgatcctg ccttggatct cacatttgtt tggtccttca
1741 atggctatgt gatcgatttt aacaaagaga atattcacta ccagaggaat tttatgctgg
1801 attccaatgg ggaattacta atccgaaatg cgcagctgaa acatgctgga agatacacat
1861 gcactgccca gacaattgtg gacaattctt cagcttcagc tgaccttgta gtgagaggcc
1921 ctccaggccc tccaggtggt ctgagaatag aagacattag agccacttct gtggcactta
1981 cttggagccg tggttcagac aatcatagtc ctatttctaa atacactatc cagaccaaga
2041 ctattctttc agatgactgg aaagatgcaa agacagatcc cccaattatt gaaggaaata
2101 tggaggcagc aagagcagtg gacttaatcc catggatgga gtatgaattc cgcgtggtag
2161 caaccaatac actgggtaga ggagagccca gtataccatc taacagaatt aaaacagacg
2221 gtgctgcacc aaatgtggct ccttcagatg taggaggtgg aggtggaaga aacagagagc
2281 tgaccataac atgggcgcct ttgtcaagag aataccacta tggcaacaat tttggttaca
2341 tagtggcatt taagccattt gatggagaag aatggaaaaa agtcacagtt actaatcctg
2401 atactggccg atatgtccat aaagatgaaa ccatgagccc ttccactgca tttcaagtta
2461 aagtcaaggc cttcaacaac aaaggagatg gaccttacag cctagtagca gtcattaatt
2521 cagcacaaga cgctcccagt gaagccccaa cagaagtagg tgtaaaagtc ttatcatctt
2581 ctgagatatc tgttcattgg gaacatgttt tagaaaaaat agtggaaagc tatcagattc
2641 ggtattgggc tgcccatgac aaagaagaag ctgcaaacag agttcaagtc accagccaag
2701 agtactcggc caggctcgag aaccttctgc cagacaccca gtattttata gaagtcgggg
2761 cctgcaatag tgcagggtgt ggacctccaa gtgacatgat tgaggctttc accaagaaag
2821 cacctcctag ccagcctcca aggatcatca gttcagtaag gtctggttca cgctatataa
2881 tcacctggga tcatgtcgtt gcactatcaa atgaatctac agtgacggga tataaggtac
2941 tctacagacc tgatggccag catgatggca agctgtattc aactcacaaa cactccatag
3001 aagtcccaat ccccagagat ggagaatacg ttgtggaggt tcgcgcgcac agtgatggag
3061 gagatggagt ggtgtctcaa gtcaaaattt caggtgcacc caccctatcc ccaagtcttc
3121 tcggcttact gctgcctgcc tttggcatcc ttgtctactt ggaattctga atgtgttgtg
3181 acagctgctg ttcccatccc agctcagaag acacccttca accctgggat gaccacaatt
3241 ccttccaatt tctgcggctc catcctaagc caaataaatt atactttaac aaactattca
3301 actgatttac aacacacatg atgactgagg cattcgggaa ccccttcatc caaaagaata
3361 aacttttaaa tggatataaa tgatttttaa ctcgttccaa tatgccttat aaaccactta
3421 acctgat (SEQ ID NO: 87)
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CONT (NM_001843) 
MKMWLLVSHLVIISITTCLAEFTWYRRYGHGVSEEDKGFGPIFE 
EQPINTIYPEESLEGKVSLNCRARASPFPVYKM^ 

PDKQKDAGIYYCLASNNYGIWRSTEATLSFGYLDPFPPEERPEVRVKEGKGMVLLCDP 
PYHFPDDLS YRWLLTSTEFPVFI TMDKRRFVSQTNGNLYI ANVEASDKGNYSCFVS S PS I 
TKSVFSKF I PL I P I PE RTTKP YPAD I WQFKD VYALMGQNVTL ECFALGNPVPD I RWR 
KVLEPMPSTAE I STSGAVLKI FNI QLEDEGI YECEAENI RGKDKHQARI YVQAFPEWV 
EH I NDTEVD I GS DL YWP CVATGKP I PT I RWLKNGYAYHKGELRLYDVTFENAGMYQC I 
AENTYGAIYANAELKILALAPTFEMNPMKKKIIiAAKGGRVIIECKPKAAPKPKFSWSK 
GTEWIiWSSRILIWEDGSLEIlSrNITRNDGGIYTCFAENNRGKANSTGTLVITDPTRII 
LAP INAD I TVGENATMQCAAS FDPALDLTFVWSFNGYVI DFNKENIHYQRNFMLDSNG 
ELLIRNAQLKHAGRYTCTAQTIVDNSSASADLWRGPPGPPGGLRIEDIRATSVALTW 
SRGSDNHSPISKYTIQTKTILSDDWKDAKTDPPIIEGNMEAARAVDLIPWMEYEFRW
ATNTLGRGEPSIPSNRIKTDGAAPNVAPSDVGGGGGRNRELTITWAPLSREYHYGNNF 
GYIVAFKPFDGEEWKKVTVTNPDTGRYVHKDETMSPSTAFQVKVKAFNNKGDGPYSLV
AVINSAQDAPSEAPTEVGVKVLSSSEISVHWEHVLEKIVESYQIRYWAAHDKEEAANR 
VQVTSQEYSARLENLLPDTQYFIEVGACNSAGCGPPSDMIEAFTKKAPPSQPPRIISS 
VRSGSRYIITWDHVVALSNESTVTGYKVLYRPDGQHDGKLYSTHKHSIEVPIPRDGEY
VVEVRAHSDGGDGVVSQVKISGAPTLSPSLLGLLLPAFGILVYLEF (SEQ ID NO: 88)
Osteopontin (NM_000582)

1 ctccctgtgt tggtggagga tgtctgcagc agcatttaaa ttctgggagg gcttggttgt
61 cagcagcagc aggaggaggc agagcacagc atcgtcggga ccagactcgt ctcaggccag
121 ttgcagcctt ctcagccaaa cgccgaccaa ggaaaactca ctaccatgag aattgcagtg
181 atttgctttt gcctcctagg catcacctgt gccataccag ttaaacaggc tgattctgga
241 agttctgagg aaaagcagct ttacaacaaa tacccagatg ctgtggccac atggctaaac
301 cctgacccat ctcagaagca gaatctccta gccccacaga cccttccaag taagtccaac
361 gaaagccatg accacatgga tgatatggat gatgaagatg atgatgacca tgtggacagc
421 caggactcca ttgactcgaa cgactctgat gatgtagatg acactgatga ttctcaccag
481 tctgatgagt ctcaccattc tgatgaatct gatgaactgg tcactgattt tcccacggac
541 ctgccagcaa ccgaagtttt cactccagtt gtccccacag tagacacata tgatggccga
601 ggtgatagtg tggtttatgg actgaggtca aaatctaaga agtttcgcag acctgacatc
661 cagtaccctg atgctacaga cgaggacatc acctcacaca tggaaagcga ggagttgaat
721 ggtgcataca aggccatccc cgttgcccag gacctgaacg cgccttctga ttgggacagc
781 cgtgggaagg acagttatga aacgagtcag ctggatgacc agagtgctga aacccacagc
841 cacaagcagt ccagattata taagcggaaa gccaatgatg agagcaatga gcattccgat
901 gtgattgata gtcaggaact ttccaaagtc agccgtgaat tccacagcca tgaatttcac
961 agccatgaag atatgctggt tgtagacccc aaaagtaagg aagaagataa acacctgaaa
1021 tttcgtattt ctcatgaatt agatagtgca tcttctgagg tcaattaaaa ggagaaaaaa
1081 tacaatttct cactttgcat ttagtcaaaa gaaaaaatgc tttatagcaa aatgaaagag
1141 aacatgaaat gcttctttct cagtttattg gttgaatgtg tatctatttg agtctggaaa
1201 taactaatgt gtttgataat tagtttagtt tgtggcttca tggaaactcc ctgtaaacta
1261 aaagcttcag ggttatgtct atgttcattc tatagaagaa atgcaaacta tcactgtatt
1321 ttaatatttg ttattctctc atgaatagaa atttatgtag aagcaaacaa aatactttta
1381 cccacttaaa aagagaatat aacattttat gtcactataa tcttttgttt tttaagttag
1441 tgtatatttt gttgtgatta tctttttgtg gtgtgaataa atcttttatc ttgaatgtaa
1501 taagaatttg gtggtgtcaa ttgcttattt gttttcccac ggttgtccag caattaataa
1561 aacataacct tttttactgc ctaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaa (SEQ ID NO: 89)
Osteopontin (NM_000582)
MRIAVICFCLLGITCAIPVKQADSGSSEEKQLYNKYPDAVATWL
NPDPSQKQNLLAPQTLPSKSNESHDHMDDMDDEDDDDHVDSQDSIDSNDSDDVDDTDD
SHQSDESHHSDESDELVTDFPTDLPATEVFTPVVPTVDTYDGRGDSVVYGLRSKSKKF
RRPDIQYPDATDEDITSHMESEELNGAYKAIPVAQDLNAPSDWDSRGKDSYETSQLDD
QSAETHSHKQSRLYKRKANDESNEHSDVIDSQELSKVSREFHSHEFHSHEDMLVVDPK
SKEEDKHLKFRISHELDSASSEVN (SEQ ID NO: 90)
Galectin 8 (NM_006499)

1 tggacttgga tccgaggcag acgaggaagc tgagaaaacc ctggcgttga ccccgtggac 
61 ctgggcgccc cgggaaggtc cagcgcttgg tccaggcagg cggggatgtg cggtgaccac 
121 cctggtcctg aaaagtccag ccccgaatct ccatccctcc tagacctgga ggcctggaac 
181 agccagccgc ccacggacgc cagagccggg aaccctgacg gcacttagct gctgacaaac 
241 aacctgctcc gtggacgcct gaaacaccag tctttggggc cagtgcctca gtttcaatcc 
301 aggtaacctt taaatgaaac ttgcctaaaa tcttaggtca tacacagaag agactccaat
361 cgacaagaag ctggaaaaga atgatgttgt ccttaaacaa cctacagaat atcatctata
421 acccggtaat cccgtatgtt ggcaccattc ccgatcagct ggatcctgga actttgattg 
481 tgatatgtgg gcatgttcct agtgacgcag acagattcca ggtggatctg cagaatggca 
541 gcagtgtgaa acctcgagcc gatgtggcct ttcatttcaa tcctcgtttc aaaagggccg 
601 gctgcattgt ttgcaatact ttgataaatg aaaaatgggg acgggaagag atcacctatg 
661 acacgccttt caaaagagaa aagtcttttg agatcgtgat tatggtgcta aaggacaaat 
721 tccaggtggc tgtaaatgga aaacatactc tgctctatgg ccacaggatc ggcccagaga 
781 aaatagacac tctgggcatt tatggcaaag tgaatattca ctcaattggt tttagcttca 
841 gctcggactt acaaagtacc caagcatcta gtctggaact gacagagata agtagagaaa 
901 atgttccaaa gtctggcacg ccccagcttc agactgtctc tccctcctgg gatttacagg 
961 gtcatggctc tgaaacattc tgtagtgttc tttggacacg agttttcctg gagatcgctt
1021 tctgcaggcc tattggtctg actgtggctt cttttcagag cctgccattc gctgcaaggt 
1081 tgaacacccc catgggccct ggacgaactg tcgtcgttaa aggagaagtg aatgcaaatg
1141 ccaaaagctt taatgttgac ctactagcag gaaaatcaaa ggatattgct ctacacttga 
1201 acccacgcct gaatattaaa gcatttgtaa gaaattcttt tcttcaggag tcctggggag
1261 aagaagagag aaatattacc tctttcccat ttagtcctgg gatgtacttt gagatgataa
1321 tttactgtga tgttagagaa ttcaaggttg cagtaaatgg cgtacacagc ctggagtaca
1381 aacacagatt taaagagctc agcagtattg acacgctgga aattaatgga gacatccact
1441 tactggaagt aaggagctgg tagcctacct acacagctgc tacaaaaacc aaaatacaga 
1501 atggcttctg tgatactggc cttgctgaaa cgcatctcac tgtcattcta ttgtttatat 
1561 tgttaaaatg agcttgtgca ccattagatc ctgctgggtg ttctcagtcc ttgccatgaa 
1621 gtatggtggt gtctagcact gaatggggaa actgggggca gcaacactta tagccagtta 
1681 aagccactct gccctctctc ctactttggc tgactcttca agaatgccat tcaacaagta 
1741 tttatggagt acctactata atacagtagc taacatgtat tgagcacaga ttttttttgg 
1801 taaaactgtg aggagctagg atatatactt ggtgaaacaa accagtatgt tccctgttct 
1861 cttgagcttc gactcttctg tgctctattg ctgcgcactg ctttttctac aggcattaca 
1921 tcaactccta aggggtcctc tgggattagt taagcagcta ttaaatcacc cgaagacact 
1981 aatttacaga agacacaact ccttccccag tgatcactgt cataaccagt gctctaccgt 
2041 atcccatcac tgaggactga tgttgactga catcatttta tcgtaataaa catgtggctc
2101 tattagctgc aagctttacc aagtaattgg catgacatct gagcacagaa attaaggcaa 
2161 aaaaccaaag caaaacaaat acatggtgct gaaattaact tgatgccaag cccaaggcag 
2221 ctgatttctg tgtatttgaa cttagggcaa atcagagtct acacagacgc ctacagaaag 
2281 tttcaggaag aggcaagatg cattcaattt gaaagatatt tatgggcaac aaagtaaggt
2341 caggattaga cttcaggcat tcataaggca ggcactatca gaaagtgtac gccaactaag 
2401 ggacccacaa agcaggcaga ggtaatgcag aaatctgttt tgttcccatg aaatcaccaa 
2461 tcaaggcctc cgttcttcta aagattagtc catcatcatt agcaactgag atcaaagcac 
2521 tcttccactt tacgtgatta aaatcaaacc tgtatcagca aaaaaaaaaa aaaaaaaaaa 
2581 aaaaaaaaaa aaa (SEQ ID NO: 91) 
Galectin 8 (NM_006499)
MLSLNNLQNIIYNPVIPYVGTIPDQLDPGTLIVICGHVPSDADR
FQVDLQNGSSVKPRADVAFHFNPRFKRAGCIVCNTLINEKWGREEITYDTPFKREKSF
EIVIMVLKDKFQVAVNGKHTLLYGHRIGPEKIDTLGIYGKVNIHSIGFSFSSDLQSTQ
ASSLELTEISRENVPKSGTPQLQTVSPSWDLQGHGSETFCSVLWTRVFLEIAFCRPIG
LTVASFQSLPFAARLNTPMGPGRTVVVKGEVNANAKSFNVDLLAGKSKDIALHLNPRL
NIKAFVRNSFLQESWGEEERNITSFPFSPGMYFEMIIYCDVREFKVAVNGVHSLEYKH
RFKELSSIDTLEINGDIHLLEVRSW (SEQ ID NO: 92)
PGS1 (biglycan, NM_001711)

1 agcctcccgc ccgccgcctc tgtctccctc tctccacaaa ctgcccagga gtgagtagct 
61 gctttcggtc cgccggacac accggacaga tagacgtgcg gacggcccac caccccagcc 
121 cgccaactag tcagcctgcg cctggcgcct cccctctcca ggtccatccg ccatgtggcc 
181 cctgtggcgc ctcgtgtctc tgctggccct gagccaggcc ctgccctttg agcagagagg 
241 cttctgggac ttcaccctgg acgatgggcc attcatgatg aacgatgagg aagcttcggg 
301 cgctgacacc tcgggcgtcc tggacccgga ctctgtcaca cccacctaca gcgccatgtg
361 tcctttcggc tgccactgcc acctgcgggt ggttcagtgc tccgacctgg gtctgaagtc
421 tgtgcccaaa gagatctccc ctgacaccac gctgctggac ctgcagaaca acgacatctc 
481 cgagctccgc aaggatgact tcaagggtct ccagcacctc tacgccctcg tcctggtgaa 
541 caacaagatc tccaagatcc atgagaaggc cttcagccca ctgcggaagc tgcagaagct 
601 ctacatctcc aagaaccacc tggtggagat cccgcccaac ctacccagct ccctggtgga 
661 gctccgcatc cacgacaacc gcatccgcaa ggtgcccaag ggagtgttca gcgggctccg 
721 gaacatgaac tgcatcgaga tgggcgggaa cccactggag aacagtggct ttgaacctgg 
781 agccttcgat ggcctgaagc tcaactacct gcgcatctca gaggccaagc tgactggcat 
841 ccccaaagac ctccctgaga ccctgaatga actccaccta gaccacaaca aaatccaggc 
901 catcgaactg gaggacctgc ttcgctactc caagctgtac aggctgggcc taggccacaa 
961 ccagatcagg atgatcgaga acgggagcct gagcttcctg cccaccctcc gggagctcca 
1021 cttggacaac aacaagttgg ccagggtgcc ctcagggctc ccagacctca agctcctcca 
1081 ggtggtctat ctgcactcca acaacatcac caaagtgggt gtcaacgact tctgtcccat 
1141 gggcttcggg gtgaagcggg cctactacaa cggcatcagc ctcttcaaca accccgtgcc 

1201 ctactgggag gtgcagccgg ccactttccg ctgcgtcact gaccgcctgg ccatccagtt
1261 tggcaactac aaaaagtaga ggcagctgca gccaccgcgg ggcctcagtg ggggtctctg
1321 gggaacacag ccagacatcc tgatggggag gcagagccag gaagctaagc cagggcccag
1381 ctgcgtccaa cccagccccc cacctcgggt ccctgacccc agctcgatgc cccatcaccg
1441 cctctccctg gctcccaagg gtgcaggtgg gcgcaaggcc cggcccccat cacatgttcc 
1501 cttggcctca gagctgcccc tgctctccca ccacagccac ccagaggcac cccatgaagc 
1561 ttttttctcg ttcactccca aacccaagtg tccaaggctc cagtcctagg agaacagtcc 
1621 ctgggtcagc agccaggagg cggtccataa gaatggggac agtgggctct gccagggctg 
1681 ccgcacctgt ccagacacac atgttctgtt cctcctcctc atgcatttcc agcctttcaa 
1741 ccctccccga ctctgcggct cccctcagcc cccttgcaag ttcatggcct gtccctccca 
1801 gacccctgct ccactggccc ttcgaccagt cctcccttct gttctctctt tccccgtcct 
1861 tcctctctct ctctctctct ctctctctct ctttctgtgt gtgtgtgtgt gtgtgtgtgt 
1921 gtgtgtgtgt gtgtgtgtgt cttgtgcttc ctcagacctt tctcgcttct gagcttggtg 
1981 gcctgttccc tccatctctc cgaacctggc ttcgcctgtc cctttcactc cacaccctct 
2041 ggccttctgc cttgagctgg gactgctttc tgtctgtccg gcctgcaccc agcccctgcc
2101 cacaaaaccc cagggacagc ggtctcccca gcctgccctg ctcaggcctt gcccccaaac 
2161 ctgtactgtc ccggaggagg ttgggaggtg gaggcccagc atcccgcgca gatgacacca 
2221 tcaaccgcca gagtcccaga caccggtttt cctagaagcc cctcaccccc actggcccac 
2281 tggtggctag gtctcccctt atccttctgg tccagcgcaa ggaggggctg cttctgaggt 
2341 cggtggctgt ctttccatta aagaaacacc gtgcaacgtg aaaaaaaaaa aaaaaaaaaa 
2401 a (SEQ ID NO: 93) 
PGS1 (biglycan, NM_001711)
MWPLWRLVSLLALSQALPFEQRGFWDFTLDDGPFMMNDEEASGA
DTSGVLDPDSVTPTYSAMCPFGCHCHLRVVQCSDLGLKSVPKEISPDTTLLDLQNNDI
SELRKDDFKGLQHLYALVLVNNKISKIHEKAFSPLRKLQKLYISKNHLVEIPPNLPSS
LVELRIHDNRIRKVPKGVFSGLRNMNCIEMGGNPLENSGFEPGAFDGLKLNYLRISEA
KLTGIPKDLPETLNELHLDHNKIQAIELEDLLRYSKLYRLGLGHNQIRMIENGSLSFL
PTLRELHLDNNKLARVPSGLPDLKLLQVVYLHSNNITKVGVNDFCPMGFGVKRAYYNG
ISLFNNPVPYWEVQPATFRCVTDRLAIQFGNYKK (SEQ ID NO: 94)
Frizzled 2 (NM_001466)

1 cgagtaaagt ttgcaaagag gcgcgggagg cggcagccgc agcgaggagg cggcggggaa 
61 gaagcgcagt ctccgggttg ggggcggggg cggggggggc gccaaggagc cgggtggggg 
121 gcggcggcca gcatgcggcc ccgcagcgcc ctgccccgcc tgctgctgcc gctgctgctg 
181 ctgcccgccg ccgggccggc ccagttccac ggggagaagg gcatctccat cccggaccac 
241 ggcttctgcc agcccatctc catcccgctg tgcacggaca tcgcctacaa ccagaccatc 
301 atgcccaacc ttctgggcca cacgaaccag gaggacgcag gcctagaggt gcaccagttc
361 tatccgctgg tgaaggtgca gtgctcgccc gaactgcgct tcttcctgtg ctccatgtac 
421 gcacccgtgt gcaccgtgct ggaacaggcc atcccgccgt gccgctctat ctgtgagcgc 
481 gcgcgccagg gctgcgaagc cctcatgaac aagttcggtt ttcagtggcc cgagcgcctg 
541 cgctgcgagc acttcccgcg ccacggcgcc gagcagatct gcgtcggcca gaaccactcc 
601 gaggacggag ctcccgcgct actcaccacc gcgccgccgc cgggactgca gccgggtgcc
661 gggggcaccc cgggtggccc gggcggcggc ggcgctcccc cgcgctacgc cacgctggag 
721 caccccttcc actgcccgcg cgtcctcaag gtgccatcct atctcagcta caagtttctg 
781 ggcgagcgtg attgtgctgc gccctgcgaa cctgcgcggc ccgatggttc catgttcttc 
841 tcacaggagg agacgcgttt cgcgcgcctc tggatcctca cctggtcggt gctgtgctgc 
901 gcttccacct tcttcactgt caccacgtac ttggtagaca tgcagcgctt ccgctaccca 
961 gagcggccta tcatttttct gtcgggctgc tacaccatgg tgtcggtggc ctacatcgcg 
1021 ggcttcgtgc tccaggagcg cgtggtgtgc aacgagcgct tctccgagga cggttaccgc 
1081 acggtggtgc agggcaccaa gaaggagggc tgcaccatcc tcttcatgat gctctacttc 
1141 ttcagcatgg ccagctccat ctggtgggtc atcctgtcgc tcacctggtt cctggcagcc 
1201 ggcatgaagt ggggccacga ggccatcgag gccaactctc agtacttcca cctggccgcc 
1261 tgggccgtgc cggccgtcaa gaccatcacc atcctggcca tgggccagat cgacggcgac 
1321 ctgctgagcg gcgtgtgctt cgtaggcctc aacagcctgg acccgctgcg gggcttcgtg 
1381 ctagcgccgc tcttcgtgta cctgttcatc ggcacgtcct tcctcctggc cggcttcgtg
1441 tcgctcttcc gcatccgcac catcatgaag cacgacggca ccaagaccga aaagctggag 
1501 cggctcatgg tgcgcatcgg cgtcttctcc gtgctctaca cagtgcccgc caccatcgtc 
1561 atcgcttgct acttctacga gcaggccttc cgcgagcact gggagcgctc gtgggtgagc 
1621 cagcactgca agagcctggc catcccgtgc ccggcgcact acacgccgcg catgtcgccc 
1681 gacttcacgg tctacatgat caaatacctc atgacgctca tcgtgggcat cacgtcgggc 
1741 ttctggatct ggtcgggcaa gacgctgcac tcgtggagga agttctacac tcgcctcacc 
1801 aacagccgac acggtgagac caccgtgtga gggacgcccc caggccggaa ccgcgcggcg 
1861 ctttcctccg cccggggtgg ggcccctaca gactccgtat tttatttttt taaataaaaa 
1921 acgatcgaaa ccatttcact tttaggttgc tttttaaaag agaactctct gcccaacacc 
1981 ccc (SEQ ID NO: 95)
Frizzled 2 (NM_001466)
MRPRSALPRLLLPLLLLPAAGPAQFHGEKGISIPDHGFCQPISI
PLCTDIAYNQTIMPNLLGHTNQEDAGLEVHQFYPLVKVQCSPELRFFLCSMYAPVCTV
LEQAIPPCRSICERARQGCEALMNKFGFQWPERLRCEHFPRHGAEQICVGQNHSEDGA
PALLTTAPPPGLQPGAGGTPGGPGGGGAPPRYATLEHPFHCPRVLKVPSYLSYKFLGE
RDCAAPCEPARPDGSMFFSQEETRFARLWILTWSVLCCASTFFTVTTYLVDMQRFRYP
ERPIIFLSGCYTMVSVAYIAGFVLQERVVCNERFSEDGYRTVVQGTKKEGCTILFMML
YFFSMASSIWWVILSLTWFLAAGMKWGHEAIEANSQYFHLAAWAVPAVKTITILAMGQ
IDGDLLSGVCFVGLNSLDPLRGFVLAPLFVYLFIGTSFLLAGFVSLFRIRTIMKHDGT
KTEKLERLMVRIGVFSVLYTVPATIVIACYFYEQAFREHWERSWVSQHCKSLAIPCPA
HYTPRMSPDFTVYMIKYLMTLIVGITSGFWIWSGKTLHSWRKFYTRLTNSRHGETTV (SEQ ID NO: 96)
ISLR (NM_005545)

1 aagcagttgt tttgctggaa ggagggagtg cgcgggctgc cccgggctcc tccctgccgc 
61 ctcctctcag tggatggttc caggcaccct gtctggggca gggagggcac aggcctgcac 
121 atcgaaggtg gggtgggacc aggctgcccc tcgccccagc atccaagtcc tcccttgggc 
181 gcccgtggcc ctgcagactc tcagggctaa ggtcctctgt tgctttttgg ttccacctta 
241 gaagaggctc cgcttgacta agagtagctt gaaggaggca ccatgcagga gctgcatctg 

301 ctctggtggg cgcttctcct gggcctggct caggcctgcc ctgagccctg cgactgtggg
361 gaaaagtatg gcttccagat cgccgactgt gcctaccgcg acctagaatc cgtgccgcct 
421 ggcttcccgg ccaatgtgac tacactgagc ctgtcagcca accggctgcc aggcttgccg 

481 gagggtgcct tcagggaggt gcccctgctg cagtcgctgt ggctggcaca caatgagatc
541 cgcacggtgg ccgccggagc cctggcctct ctgagccatc tcaagagcct ggacctcagc 
601 cacaatctca tctctgactt tgcctggagc gacctgcaca acctcagtgc cctccaattg 
661 ctcaagatgg acagcaacga gctgaccttc atcccccgcg acgccttccg cagcctccgt 
721 gctctgcgct cgctgcaact caaccacaac cgcttgcaca cattggccga gggcaccttc 
781 accccgctca ccgcgctgtc ccacctgcag atcaacgaga accccttcga ctgcacctgc 
841 ggcatcgtgt ggctcaagac atgggccctg accacggccg tgtccatccc ggagcaggac 
901 aacatcgcct gcacctcacc ccatgtgctc aagggtacgc cgctgagccg cctgccgcca
961 ctgccatgct cggcgccctc agtgcagctc agctaccaac ccagccagga tggtgccgag 

1021 ctgcggcctg gttttgtgct ggcactgcac tgtgatgtgg acgggcagcc ggcccctcag 
1081 cttcactggc acatccagat acccagtggc attgtggaga tcaccagccc caacgtgggc
1141 actgatgggc gtgccctgcc tggcacccct gtggccagct cccagccgcg cttccaggcc 
1201 tttgccaatg gcagcctgct tatccccgac tttggcaagc tggaggaagg cacctacagc 

1261 tgcctggcca ccaatgagct gggcagtgct gagagctcag tggacgtggc actggccacg
1321 cccggtgagg gtggtgagga cacactgggg cgcaggttcc atggcaaagc ggttgaggga
1381 aagggctgct atacggttga caacgaggtg cagccatcag ggccggagga caatgtggtc
1441 atcatctacc tcagccgtgc tgggaaccct gaggctgcag tcgcagaagg ggtccctggg 
1501 cagctgcccc caggcctgct cctgctgggc caaagcctcc tcctcttctt cttcctcacc 
1561 tccttctagc cccacccagg gcttccctaa ctcctcccct tgcccctacc aatgcccctt
1621 taagtgctgc aggggtctgg ggttggcaac tcctgaggcc tgcatgggtg acttcacatt 
1681 ttcctacctc tccttctaat ctcttctaga gcacctgcta tccccaactt ctagacctgc 
1741 tccaaactag tgactaggat agaatttgat cccctaactc actgtctgcg gtgctcattg 
1801 ctgctaacag cattgcctgt gctctcctct caggggcagc atgctaacgg ggcgacgtcc 
1861 taatccaact gggagaagcc tcagtggtgg aattccaggc actgtgactg tcaagctggc 
1921 aagggccagg attgggggaa tggagctggg gcttagctgg gaggtggtct gaagcagaca 
1981 gggaatggga gaggaggatg ggaagtagac agtggctggt atggctctga ggctccctgg
2041 ggcctgctca agctcctcct gctccttgct gttttctgat gatttggggg cttgggagtc
2101 cctttgtcct catctgagac tgaaatgtgg ggatccagga tggcttcctt cctcttaccc 
2161 ttcctccctc agcctgcaac ctctatcctg gaacctgtcc tccctttctc cccaactatg 
2221 catctgttgt ctgctcctct gcaaaggcca gccagcttgg gagcagcaga gaaataaaca 
2281 gcatttctga tgcc (SEQ ID NO: 97) 
ISLR (NM_005545)
MQELHLLWWALLLGLAQACPEPCDCGEKYGFQIADCAYRDLESV
PPGFPANVTTLSLSANRLPGLPEGAFREVPLLQSLWLAHNEIRTVAAGALASLSHLKS
LDLSHNLISDFAWSDLHNLSALQLLKMDSNELTFIPRDAFRSLRALRSLQLNHNRLHT
LAEGTFTPLTALSHLQINENPFDCTCGIVWLKTWALTTAVSIPEQDNIACTSPHVLKG
TPLSRLPPLPCSAPSVQLSYQPSQDGAELRPGFVLALHCDVDGQPAPQLHWHIQIPSG
IVEITSPNVGTDGRALPGTPVASSQPRFQAFANGSLLIPDFGKLEEGTYSCLATNELG
SAESSVDVALATPGEGGEDTLGRRFHGKAVEGKGCYTVDNEVQPSGPEDNVVIIYLSR
AGNPEAAVAEGVPGQLPPGLLLLGQSLLLFFFLTSF (SEQ ID NO: 98)
FLJ23399 (NM_022763)

1 tgacccggtc cgtgtgggcc agcgggaagg aagccagttg agggaagttc tccatgaatg 
61 tacgtcacaa tgatgatgac cgaccaaatc cctctggaac tgccaccatt gctgaacgga 
121 gaggtagcca tgatgcccca cttggtgaat ggagatgcag ctcagcaggt tattctcgtt 
181 caagttaatc caggtgagac tttcacaata agagcagagg atggaacact tcagtgcatt 
241 caaggacctg ctgaagttcc catgatgtca cccaatggat ccattcctcc cattcatgtg 
301 cctccaggtt atatctcaca ggtgattgaa gatagtactg gagtccgccg ggtggtggtc
361 acaccccagt ctcctgagtg ttatccccca agctacccct cagccatgtc tccaacccat 
421 catctccctc cctatctgac tcaccatcca cattttattc ataactcaca cacggcttac 
481 tacccacctg ttaccggacc tggagatatg ccgcctcagt tttttcccca gcatcatctt 
541 ccccacacaa tatatggtga gcaagaaatt ataccatttt atggaatgtc aagctacatc 
601 acccgagaag accagtacag caagcctccg cacaaaaaac tgaaagaccg ccagatcgat
661 cgccagaacc gactcaacag acctccttct gctatctaca aaagcagctg cacaacagta 
721 tacaatggct atgggaaggg ccatagtggt ggaagtggcg gaggcggcag cggtagtggt
781 cccggaatta agaaaacaga gcgacgagca agaagcagcc caaagtcgaa tgattcagac 
841 ttgcaagaat atgagttgga agtaaagagg gtgcaagaca ttctttcggg aatagagaaa 
901 ccacaggttt ctaatattca ggcaagagca gttgtgttgt cctgggctcc ccctgttgga
961 ctttcctgtg gaccccacag tggtctttcc ttcccctaca gttacgaggt ggccttatca 
1021 gacaaaggac gagatggaaa atacaagata atttacagtg gagaagaatt agaatgtaac 
1081 ctgaaagatc ttagaccagc aacagattat catgtgaggg tgtatgccat gtacaattcc 
1141 gtaaagggat cctgctccga gcctgttagc ttcaccaccc acagctgtgc acccgagtgt 
1201 cctttccccc ctaagctggc acataggagc aaaagttcac taaccctgca gtggaaggca
1261 ccaattgaca acggttcaaa aatcaccaac taccttttag agtgggatga gggaaaaaga
1321 aatagtggtt tcagacagtg cttcttcggg agccagaagc actgcaagtt gacaaagctt
1381 tgtccggcaa tggggtacac attcaggctg gccgctcgaa acgacattgg taccagtggt
1441 tatagccaag aggtggtgtg ctacacatta ggaaatatcc ctcagatgcc ttctgcacca 
1501 aggctggttc gagctggcat cacatgggtc acgttgcagt ggagtaagcc agaaggctgt 
1561 tcacccgagg aagtgatcac ctacaccttg gaaattcagg aggatgaaaa tgataacctt
1621 ttccacccaa aatacactgg agaggattta acctgtactg tgaaaaatct caaaagaagc 
1681 acacagtata cattcaggct gactgcttct aatacggaag gaaaaagctg tccaagcgaa 
1741 gttcttgttt gtacgacgag tcctgacagg cctggacctc ctaccagacc gcttgtcaaa 
1801 ggcccagtta catctcatgg ctttagtgtc aaatgggatc cccctaagga caatggtggt 
1861 tcagaaatcc tcaagtactt gctagagatt actgatggaa attctgaagc gaatcagtgg
1921 gaagtggcct acagtgggtc ggctaccgaa tacaccttca cccacttgaa accaggcact 
1981 ttgtacaaac tccgagcatg ctgcatcagt accggcggac acagccagtg ttctgaaagt 
2041 ctccctgttc gcacactaag cattgcacca ggtcaatgtc gaccaccgag ggttttgggt 
2101 agaccaaagc acaaagaagt ccacttagag tgggatgttc ctgcatcgga aagtggctgt 
2161 gaggtctcag agtacagcgt ggagatgacg gagcccgaag acgtagcctc ggaagtgtac 
2221 catggcccag agctggagtg caccgtcggc aacctgcttc ctggaaccgt gtatcgcttc 
2281 cgggtgaggg ctctgaatga tggagggtat ggtccctatt ctgatgtctc agaaattacc
2341 actgctgcag ggcctcctgg acaatgcaaa gcaccttgta tttcttgtac acctgatgga 
2401 tgtgtcttag tgggttggga gagtcctgat agttctggtg ctgacatctc agagtacagg
2461 ttggaatggg gagaagatga agaatcctta gaactcattt atcatgggac agacacccgt 
2521 tttgaaataa gagacctgtt gcctgctgca cagtattgct gtagactaca ggccttcaat 
2581 caagcagggg cagggccgta cagtgaactt gtcctttgcc agacgccagc gtctgcccct
2641 gaccccgtct ccactctctg tgtcctggag gaggagcccc ttgatgccta ccctgattca 
2701 ccttctgcgt gccttgtact gaactgggaa gagccgtgca ataacggatc tgaaatcctt 
2761 gcttacacca ttgatctagg agacactagc attaccgtgg gcaacaccac catgcatgtt 
2821 atgaaagatc tccttccaga aaccacctac cggatcagaa ttcaggctat aaatgaaatt
2881 ggagctggac catttagtca gttcattaaa gcaaaaactc ggccattacc acccttgcct
2941 cctaggctag aatgtgctgc tgctggtcct cagagcctga agctaaaatg gggagacagt
3001 aactccaaga cacatgctgc tgaggacatt gtgtacacac tacagctgga ggacagaaac
3061 aagaggttta tttcaatcta cagaggaccc agccacacct acaaggtcca gagactgacg 
3121 gaattcacat gctactcctt cagaatccag gcagcaagcg aggctggaga agggcccttc 
3181 tcagaaacct ataccttcag cacaaccaaa agtgtccccc ccaccatcaa agcacctcga 
3241 gtaacacagt tagaaggaaa ttcatgtgaa attttatggg agacggtacc atcaatgaaa 
3301 ggtgaccctg ttaactacat tctgcaggta ttggttggaa gagaatctga gtacaaacag
3361 gtgtacaagg gagaagaagc cacattccaa atctcaggcc tccagaccaa cacagactac
3421 aggttccgcg tatgtgcgtg tcgtcgctgt ttagacacct ctcaggagct aagcggagcc 
3481 ttcagcccct ctgcggcttt tgtattacaa cgaagtgagg tcatgcttac aggggacatg
3541 gggagcttag atgatcccaa aatgaagagc atgatgccta ctgatgaaca gtttgcagcc
3601 atcattgtgc ttggctttgc aactttgtcc attttatttg cctttatatt acagtacttc 
3661 ttaatgaagt aaacccaaca aaactagagg tatgaattaa tgctacacat tttaatacac
3721 acatttattc agatactccc ctttttaaag cccttttgtt ttttgattta tatactctgt 
3781 tttacagatt tagctagaaa aaaaatgtca gtgttttggt gcaccttttt gaaatgcaaa
3841 actaggaaaa ggttaaactg gatttttttt tttaaaaaaa agaaaaaaaa agaagaaaag
3901 tataccagat accaaaagct agctttctta tgttttcctt taaattttca gatttacctt
3961 cattctgttt tcactgatgt cttttgcaag cctttgattt tttttttttt gttacagttt 
4021 agtaatttat attcaccagt cacttcatat gtcttgaaca tctgtatctg taaacatgaa 
4081 tcaccgtgtg tgtacttaca gggctaggat ttcagtgttg tcagagtatt accacacagc 
4141 aacagcaaca tacagaagat atgttcactc agataagact gccctaaaca accattttgt 

4201 cactcagtta tttaactgtg tttagctcat ttaaatcaaa atgtgtactt taatctaaaa
4261 tgttttaata atctgtattt cttataattt taacactatg agctgcctgt ataagaaatc 
4321 aagtaaccag aatgcaccta taaattatgg agcattgtag attttaccac atcaattcat 

4381 agcagtaact ttaagagggc attgtgcaat agttagttgt tttcttgttc agctatttta
4441 aaggctgctt taacttgttt gtttgtcttt gtatataact acttctaatc taatcactag 
4501 agttattata ttctgttatg tttgaccaga attatatgac aagaactggt gacagtttag 
4561 tgcctctgcc cattgtccat gatttacact aattgtgagc agtcttctta tgtgtcagct 
4621 cattattttt gaaacatttg cctttaggct gttctttgag gtatcaatga agtgattgaa 
4681 tttcaatacc ttaattcagt gcacataata ctaatgtaac agcagatgaa aattgataaa 
4741 acccaaaaga gagtcatcta aatttgtagt tcctatttct gtgggtttgc ctggccatgg 

4801 ttggagaggg aatggtgttt gatggtaaac acagggtgtt tggggatcaa ggagcctaga
4861 ttctctccct ggatctgtca ctaacttgct gcgtgacctg aacacgtcac tttacctctc
4921 tgtgcctcag ttttcccatg catgaaaaat aaaataaaat aaaacgggga ttctaatgtt
4981 tgtaagtgct ttgagatctt tgaccaacag gtgctattgg agtgcaaagt gttactctta 
5041 cgtgtttatt ttgagtcatg agataatcaa ttttaaccca aagtcattgg attatttata 
5101 tgaagtccat aatgttcgag tacctcaggg acatttaaga gttggaggtg caaatatatt 
5161 ccaaaagggt gcaacagaca cagtgtatcc ccctgcttct gtttttgtat atttttgcta 
5221 cttggttttt cttgatcata gctattttgt gcttgatctt tattgtctaa gatgcagtat 
5281 cctgtactag cttataatat tcccatacca aagtcatggg gaaacaaaca ttattttgtt
5341 tttggtttat ttatactata ttctgcatac agtactttaa atgccaatta cagtgcaatc 
5401 tttatttatt gtaaaatttt ttaagtgtac ttatgtacta attttccctt gtagcatgtt 
5461 atatttttgt gttttatact tttgtaattt taggtcagtc ttgttccttg gcaacatctg 
5521 tagtattatt aatcttctga cattttctta tgtttttaaa aagataagag catctagtgc 
5581 attaaatgcc aaaaaaaaaa tacattatca gtgattgaaa cgtttacatg tacccaaaaa 
5641 ccataatcat ctcttggaag aaaatgctga gatcaatgaa ttattctgtg tgcctatatt 
5701 gacgtagtga gtactagaga gttctgtatt ttattattga ctataataat tagtttaatt 

5761 agctttgcaa actgatggca tcaaggtaaa tatatttttg ccaaagttct ggccttccaa
5821 aactcacccc cttatttaaa tgtgtgctat gacccactat gaccacagca tctgcatttt 
5881 ctaaaaaatt ccatgcaggt gttttgggga gaggtatttt ttaagcaatg aaaattcaac
5941 tgagtacaaa gccccctctt ggggggttgg ggaagtctct tttttgaaac acttcagaac 
6001 tgctgctata aagaaattct ctaattggtt gaattttttt tttaagtaaa tagtacttta 
6061 ggccaaaatt tatatgaata tttgatcttc ttgagatttt catactatca tttaaccacc 
6121 aggaagctga agtgtgtgaa gtacaaagct gacagcactt tattttattg ctctccatta 
6181 tttggtattc attatattcc ttcagtcaga aaattattac tctctatggc actgtttttt 
6241 atcacaaata tgtatatgtg atattgatat ataactatat atattgccat cacacacgaa 
6301 caataaaata aagtgttcta ttaacctgat ctctttgccc ttttgctatg tgaggagtga
6361 atgagtggcc ttctgatgct ctgactcttc tctgtatgtc aaactcatcc ctggcacaag
6421 aaattccagt catgtgaagc aaactgccct ttgtcctcaa agaaattgtt gaaaaagaaa 
6481 actttttaaa gagatttttt gcatattctc tgccttgttc ttatcaactt gaaatgttgg 
6541 cattttctaa ccttgttttg ttggctacaa taattcagta ttcatgtcaa aattgagaag 
6601 tgccctaatt gaatgtgttt gaatgttatc cttgcacaat tctttaaatt gaaagataaa 
6661 atgttttacc tcactgttgg acatacattc caagcttttc aactctagga gaaaaagaaa 
6721 atcatgtttt cctgtattgt aaattttaga ctatttcata tacattgtat taaaactgcc 
6781 atatcaattt taatgtatag attttgcaaa tattatgcta tatgtaatac ctaactgtat 
6841 ctgtagtgta tatgtaatat atttatgccc aataaatgtt ttaattcttt ctga (SEQ ID NO: 99)
FLJ23399 (NM_022763)
MYVTMMMTDQIPLELPPLLNGEVAMMPHLVNGDAAQQVILVQVN
PGETFTIRAEDGTLQCIQGPAEVPMMSPNGSIPPIHVPPGYISQVIEDSTGVRRVVVT
PQSPECYPPSYPSAMSPTHHLPPYLTHHPHFIHNSHTAYYPPVTGPGDMPPQFFPQHH 
LPHTIYGEQEIIPFYGMSSYITREDQYSKPPHKKLKDRQIDRQNRLNRPPSAIYKSSC 
TTVYNGYGKGHSGGSGGGGSGSGPGIKKTERRARSSPKSNDSDLQEYELEVKRVQDIL 
SGIEKPQVSNIQARAVVLSWAPPVGLSCGPHSGLSFPYSYEVALSDKGRDGKYKIIYS
GEELECNLKDLRPATDYHVRVYAMYNSVKGSCSEPVSFTTHSCAPECPFPPKLAHRSK
SSLTLQWKAPIDNGSKITNYLLEWDEGKRNSGFRQCFFGSQKHCKLTKLCPAMGYTFR
LAARNDIGTSGYSQEVVCYTLGNIPQMPSAPRLVRAGITWVTLQWSKPEGCSPEEVIT
YTLEIQEDENDNLFHPKYTGEDLTCTVKNLKRSTQYTFRLTASNTEGKSCPSEVLVCT
TSPDRPGPPTRPLVKGPVTSHGFSVKWDPPKDNGGSEILKYLLEITDGNSEANQWEVA
YSGSATEYTFTHLKPGTLYKLRACCISTGGHSQCSESLPVRTLSIAPGQCRPPRVLGR
PKHKEVHLEWDVPASESGCEVSEYSVEMTEPEDVASEVYHGPELECTVGNLLPGTVYR 
FRVRALNDGGYGPYSDVSEITTAAGPPGQCKAPCISCTPDGCVLVGWESPDSSGADIS 
EYRLEWGEDEESLELIYHGTDTRFEIRDLLPAAQYCCRLQAFNQAGAGPYSELVLCQT 
PASAPDPVSTLCVLEEEPLDAYPDSPSACLVLNWEEPCNNGSEILAYTIDLGDTSITV
GNTTMHVMKDLLPETTYRIRIQAINEIGAGPFSQFIKAKTRPLPPLPPRLECAAAGPQ
SLKLKWGDSNSKTHAAEDIVYTLQLEDRNKRFISIYRGPSHTYKVQRLTEFTCYSFRI
QAASEAGEGPFSETYTFSTTKSVPPTIKAPRVTQLEGNSCEILWETVPSMKGDPVNYI
LQVLVGRESEYKQVYKGEEATFQISGLQTNTDYRFRVCACRRCLDTSQELSGAFSPSA 

AFVLQRSEVMLTGDMGSLDDPKMKSMMPTDEQFAAIIVLGFATLSILFAFILQYFLMK (SEQ ID NO: 100) 
TEM1 (NM_020404)

1 tcgcgatgct gctgcgcctg ttgctggcct gggcggccgc agggcccaca ctgggccagg 
61 acccctgggc tgctgagccc cgtgccgcct gcggccccag cagctgctac gctctcttcc 
121 cacggcgccg caccttcctg gaggcctggc gggcctgccg cgagctgggg ggcgacctgg 
181 ccactcctcg gacccccgag gaggcccagc gtgtggacag cctggtgggt gcgggcccag 
241 ccagccggct gctgtggatc gggctgcagc ggcaggcccg gcaatgccag ctgcagcgcc 
301 cactgcgcgg cttcacgtgg accacagggg accaggacac ggctttcacc aactgggccc
361 agccagcctc tggaggcccc tgcccggccc agcgctgtgt ggccctggag gcaagtggcg
421 agcaccgctg gctggagggc tcgtgcacgc tggctgtcga cggctacctg tgccagtttg 
481 gcttcgaggg cgcctgcccg gcgctgcaag atgaggcggg ccaggccggc ccagccgtgt 
541 ataccacgcc cttccacctg gtctccacag agtttgagtg gctgcccttc ggctctgtgg 
601 ccgctgtgca gtgccaggct ggcaggggag cctctctgct ctgcgtgaag cagcctgagg 
661 gaggtgtggg ctggtcacgg gctgggcccc tgtgcctggg gactggctgc agccctgaca 
721 acgggggctg cgaacacgaa tgtgtggagg aggtggatgg tcacgtgtcc tgccgctgca 
781 ctgagggctt ccggctggca gcagacgggc gcagttgcga ggacccctgt gcccaggctc 
841 cgtgcgagca gcagtgtgag cccggtgggc cacaaggcta cagctgccac tgtcgcctgg 
901 gtttccggcc agcggaggat gatccgcacc gctgtgtgga cacagatgag tgccagattg 
961 ccggtgtgtg ccagcagatg tgtgtcaact acgttggtgg cttcgagtgt tattgtagcg 
1021 agggacatga gctggaggct gatggcatca gctgcagccc tgcaggggcc atgggtgccc 
1081 aggcttccca ggacctcgga gatgagttgc tggatgacgg ggaggatgag gaagatgaag 
1141 acgaggcctg gaaggccttc aacggtggct ggacggagat gcctgggatc ctgtggatgg 

1201 agcctacgca gccgcctgac tttgccctgg cctatagacc gagcttccca gaggacagag
1261 agccacagat accctacccg gagcccacct ggccaccccc gctcagtgcc cccagggtcc 
1321 cctaccactc ctcagtgctc tccgtcaccc ggcctgtggt ggtctctgcc acgcatccca 

1381 cactgccttc tgcccaccag cctcctgtga tccctgccac acacccagct ttgtcccgtg
1441 accaccagat ccccgtgatc gcagccaact atccagatct gccttctgcc taccaacccg 
1501 gtattctctc tgtctctcat tcagcacagc ctcctgccca ccagccccct atgatctcaa 
1561 ccaaatatcc ggagctcttc cctgcccacc agtcccccat gtttccagac acccgggtcg 
1621 ctggcaccca gaccaccact catttgcctg gaatcccacc taaccatgcc cctctggtca 
1681 ccaccctcgg tgcccagcta ccccctcaag ccccagatgc ccttgtcctc agaacccagg 
1741 ccacccagct tcccattatc ccaactgccc agccctctct gaccaccacc tccaggtccc 
1801 ctgtgtctcc tgcccatcaa atctctgtgc ctgctgccac ccagcccgca gccctcccca 
1861 ccctcctgcc ctctcagagc cccactaacc agacctcacc catcagccct acacatcccc 
1921 attccaaagc cccccaaatc ccaagggaag atggccccag tcccaagttg gccctgtggc 
1981 tgccctcacc agctcccaca gcagccccaa cagccctggg ggaggctggt cttgccgagc 
2041 acagccagag ggatgaccgg tggctgctgg tggcactcct ggtgccaacg tgtgtctttt
2101 tggtggtcct gcttgcactg ggcatcgtgt actgcacccg ctgtggcccc catgcaccca 
2161 acaagcgcat cactgactgc tatcgctggg tcatccatgc tgggagcaag agcccaacag 
2221 aacccatgcc ccccaggggc agcctcacag gggtgcagac ctgcagaacc agcgtgtgat 
2281 ggggtgcaga cccccctcat ggagtatggg gcgctggaca catggccggg gctgcaccag 
2341 ggacccatgg gggctgccca gctggacaga tggcttcctg ctccccaggc ccagccaggg 
2401 tcctctctca accactagac ttggctctca ggaactctgc ttcctggccc agcgctcgtg 
2461 accaaggata caccaaagcc cttaagacct cagggggcgg gtgctggggt cttctccaat 
2521 aaatggggtg tcaaccttaa aaaaaaaaaa aaaaaaaaaa aaaaa (SEQ ID NO: 101) 
TEM1 (NM_020404)
MLLRLLLAWAAAGPTLGQDPWAAEPRAACGPSSCYALFPRRRTF

LEAWRACRELGGDLATPRTPEEAQRVDSLVGAGPASRLLWIGLQRQARQCQLQRPLRG 
FTWTTGDQDTAFTNWAQPASGGPCPAQRCVALEASGEHRWLEGSCTLAVDGYLCQFGF
EGACPALQDEAGQAGPAVYTTPFHLVSTEFEWLPFGSVAAVQCQAGRGASLLCVKQPE 
GGVGWSRAGPLCLGTGCSPDNGGCEHECVEEVDGHVSCRCTEGFRLAADGRSCEDPCA 
QAPCEQQCEPGGPQGYSCHCRLGFRPAEDDPHRCVDTDECQIAGVCQQMCVNYVGGFE 
CYCSEGHELEADGISCSPAGAMGAQASQDLGDELLDDGEDEEDEDEAWKAFNGGWTEM 
PGILWMEPTQPPDFALAYRPSFPEDREPQIPYPEPTWPPPLSAPRVPYHSSVLSVTRP 
VVVSATHPTLPSAHQPPVIPATHPALSRDHQIPVIAANYPDLPSAYQPGILSVSHSAQ
PPAHQPPMISTKYPELFPAHQSPMFPDTRVAGTQTTTHLPGIPPNHAPLVTTLGAQLP 
PQAPDALVLRTQATQLPIIPTAQPSLTTTSRSPVSPAHQISVPAATQPAALPTLLPSQ 
SPTNQTSPISPTHPHSKAPQIPREDGPSPKLALWLPSPAPTAAPTALGEAGLAEHSQR 
DDRWLLVALLVPTCVFLVVLLALGIVYCTRCGPHAPNKRITDCYRWVIHAGSKSPTEP
MPPRGSLTGVQTCRTSV (SEQ ID NO: 102) 
Tie2 ligand2 (NM_001147)

1 tgggttggtg tttatctcct cccagccttg agggagggaa caacactgta ggatctgggg 

61 agagaggaac aaaggaccgt gaaagctgct ctgtaaaagc tgacacagcc ctcccaagtg 

121 agcaggactg ttcttcccac tgcaatctga cagtttactg catgcctgga gagaacacag 

181 cagtaaaaac caggtttgct actggaaaaa gaggaaagag aagactttca ttgacggacc 

241 cagccatggc agcgtagcag ccctgcgttt cagacggcag cagctcggga ctctggacgt 

301 gtgtttgccc tcaagtttgc taagctgctg gtttattact gaagaaagaa tgtggcagat

361 tgttttcttt actctgagct gtgatcttgt cttggccgca gcctataaca actttcggaa 

421 gagcatggac agcataggaa agaagcaata tcaggtccag catgggtcct gcagctacac 

481 tttcctcctg ccagagatgg acaactgccg ctcttcctcc agcccctacg tgtccaatgc 

541 tgtgcagagg gacgcgccgc tcgaatacga tgactcggtg cagaggctgc aagtgctgga 

601 gaacatcatg gaaaacaaca ctcagtggct aatgaagctt gagaattata tccaggacaa 

661 catgaagaaa gaaatggtag agatacagca gaatgcagta cagaaccaga cggctgtgat 

721 gatagaaata gggacaaacc tgttgaacca aacagctgag caaacgcgga agttaactga 

781 tgtggaagcc caagtattaa atcagaccac gagacttgaa cttcagctct tggaacactc 

841 cctctcgaca aacaaattgg aaaaacagat tttggaccag accagtgaaa taaacaaatt 

901 gcaagataag aacagtttcc tagaaaagaa ggtgctagct atggaagaca agcacatcat 

961 ccaactacag tcaataaaag aagagaaaga tcagctacag gtgttagtat ccaagcaaaa 

1021 ttccatcatt gaagaactag aaaaaaaaat agtgactgcc acggtgaata attcagttct 

1081 tcaaaagcag caacatgatc tcatggagac agttaataac ttactgacta tgatgtccac 

1141 atcaaactca gctaaggacc ccactgttgc taaagaagaa caaatcagct tcagagactg 

1201 tgctgaagta ttcaaatcag gacacaccac aaatggcatc tacacgttaa cattccctaa 

1261 ttctacagaa gagatcaagg cctactgtga catggaagct ggaggaggcg ggtggacaat 

1321 tattcagcga cgtgaggatg gcagcgttga ttttcagagg acttggaaag aatataaagt 

1381 gggatttggt aacccttcag gagaatattg gctgggaaat gagtttgttt cgcaactgac

1441 taatcagcaa cgctatgtgc ttaaaataca ccttaaagac tgggaaggga atgaggctta 

1501 ctcattgtat gaacatttct atctctcaag tgaagaactc aattatagga ttcaccttaa 

1561 aggacttaca gggacagccg gcaaaataag cagcatcagc caaccaggaa atgattttag 

1621 cacaaaggat ggagacaacg acaaatgtat ttgcaaatgt tcacaaatgc taacaggagg 

1681 ctggtggttt gatgcatgtg gtccttccaa cttgaacgga atgtactatc cacagaggca 

1741 gaacacaaat aagttcaacg gcattaaatg gtactactgg aaaggctcag gctattcgct 

1801 caaggccaca accatgatga tccgaccagc agatttctaa acatcccagt ccacctgagg 

1861 aactgtctcg aactattttc aaagacttaa gcccagtgca ctgaaagtca cggctgcgca 

1921 ctgtgtcctc ttccaccaca gagggcgtgt gctcggtgct gacgggaccc acatgctcca 

1981 gattagagcc tgtaaacttt atcacttaaa cttgcatcac ttaacggacc aaagcaagac

2041 cctaaacatc cataattgtg attagacaga acacctatgc aaagatgaac ccgaggctga 

2101 gaatcagact gacagtttac agacgctgct gtcacaacca agaatgttat gtgcaagttt 

2161 atcagtaaat aactggaaaa cagaacactt atgttataca atacagatca tcttggaact 

2221 gcattcttct gagcactgtt tatacactgt gtaaataccc atatgtcct (SEQ ID NO: 103)
Tie2 ligand2 (NM_001147)
MWQIVFFTLSCDLVLAAAYNNFRKSMDSIGKKQYQVQHGSCSYT 

FLLPEMDNCRSSSSPYVSNAVQRDAPLEYDDSVQRLQVLENIMENNTQWLMKLENYIQ
DNMKKEMVEIQQNAVQNQTAVMIEIGTNLLNQTAEQTRKLTDVEAQVLNQTTRLELQL
LEHSLSTNKLEKQILDQTSEINKLQDKNSFLEKKVLAMEDKHIIQLQSIKEEKDQLQV
LVSKQNSIIEELEKKIVTATVNNSVLQKQQHDLMETVNNLLTMMSTSNSAKDPTVAKE
EQISFRDCAEVFKSGHTTNGIYTLTFPNSTEEIKAYCDMEAGGGGWTIIQRREDGSVD
FQRTWKEYKVGFGNPSGEYWLGNEFVSQLTNQQRYVLKIHLKDWEGNEAYSLYEHFYL
SSEELNYRIHLKGLTGTAGKISSISQPGNDFSTKDGDNDKCICKCSQMLTGGWWFDAC
GPSNLNGMYYPQRQNTNKFNGIKWYYWKGSGYSLKATTMMIRPADF (SEQ ID NO: 104)
VEGFC (NM_005429)

1 cggggaaggg gagggaggag ggggacgagg gctctggcgg gtttggaggg gctgaacatc 
61 gcggggtgtt ctggtgtccc ccgccccgcc tctccaaaaa gctacaccga cgcggaccgc 
121 ggcggcgtcc tccctcgccc tcgcttcacc tcgcgggctc cgaatgcggg gagctcggat 
181 gtccggtttc ctgtgaggct tttacctgac acccgccgcc tttccccggc actggctggg 
241 agggcgccct gcaaagttgg gaacgcggag ccccggaccc gctcccgccg cctccggctc 
301 gcccaggggg ggtcgccggg aggagcccgg gggagaggga ccaggagggg cccgcggcct
361 cgcaggggcg cccgcgcccc cacccctgcc cccgccagcg gaccggtccc ccacccccgg
421 tccttccacc atgcacttgc tgggcttctt ctctgtggcg tgttctctgc tcgccgctgc
481 gctgctcccg ggtcctcgcg aggcgcccgc cgccgccgcc gccttcgagt ccggactcga 
541 cctctcggac gcggagcccg acgcgggcga ggccacggct tatgcaagca aagatctgga 
601 ggagcagtta cggtctgtgt ccagtgtaga tgaactcatg actgtactct acccagaata 
661 ttggaaaatg tacaagtgtc agctaaggaa aggaggctgg caacataaca gagaacaggc
721 caacctcaac tcaaggacag aagagactat aaaatttgct gcagcacatt ataatacaga 
781 gatcttgaaa agtattgata atgagtggag aaagactcaa tgcatgccac gggaggtgtg 
841 tatagatgtg gggaaggagt ttggagtcgc gacaaacacc ttctttaaac ctccatgtgt 
901 gtccgtctac agatgtgggg gttgctgcaa tagtgagggg ctgcagtgca tgaacaccag
961 cacgagctac ctcagcaaga cgttatttga aattacagtg cctctctctc aaggccccaa 
1021 accagtaaca atcagttttg ccaatcacac ttcctgccga tgcatgtcta aactggatgt
1081 ttacagacaa gttcattcca ttattagacg ttccctgcca gcaacactac cacagtgtca 
1141 ggcagcgaac aagacctgcc ccaccaatta catgtggaat aatcacatct gcagatgcct 
1201 ggctcaggaa gattttatgt tttcctcgga tgctggagat gactcaacag atggattcca
1261 tgacatctgt ggaccaaaca aggagctgga tgaagagacc tgtcagtgtg tctgcagagc
1321 ggggcttcgg cctgccagct gtggacccca caaagaacta gacagaaact catgccagtg
1381 tgtctgtaaa aacaaactct tccccagcca atgtggggcc aaccgagaat ttgatgaaaa
1441 cacatgccag tgtgtatgta aaagaacctg ccccagaaat caacccctaa atcctggaaa
1501 atgtgcctgt gaatgtacag aaagtccaca gaaatgcttg ttaaaaggaa agaagttcca
1561 ccaccaaaca tgcagctgtt acagacggcc atgtacgaac cgccagaagg cttgtgagcc
1621 aggattttca tatagtgaag aagtgtgtcg ttgtgtccct tcatattgga aaagaccaca 
1681 aatgagctaa gattgtactg ttttccagtt catcgatttt ctattatgga aaactgtgtt 
1741 gccacagtag aactgtctgt gaacagagag acccttgtgg gtccatgcta acaaagacaa 
1801 aagtctgtct ttcctgaacc atgtggataa ctttacagaa atggactgga gctcatctgc
1861 aaaaggcctc ttgtaaagac tggttttctg ccaatgacca aacagccaag attttcctct 
1921 tgtgatttct ttaaaagaat gactatataa tttatttcca ctaaaaatat tgtttctgca 
1981 ttcattttta tagcaacaac aattggtaaa actcactgtg atcaatattt ttatatcatg 
2041 caaaatatgt ttaaaataaa atgaaaattg tattat (SEQ ID NO: 105) 
VEGFC (NM_005429)
MHLLGFFSVACSLLAAALLPGPREAPAAAAAFESGLDLSDAEPD

AGEATAYASKDLEEQLRSVSSVDELMTVLYPEYWKMYKCQLRKGGWQHNREQANLNSR
TEETIKFAAAHYNTEILKSIDNEWRKTQCMPREVCIDVGKEFGVATNTFFKPPCVSVY
RCGGCCNSEGLQCMNTSTSYLSKTLFEITVPLSQGPKPVTISFANHTSCRCMSKLDVY
RQVHSIIRRSLPATLPQCQAANKTCPTNYMWNNHICRCLAQEDFMFSSDAGDDSTDGF
HDICGPNKELDEETCQCVCRAGLRPASCGPHKELDRNSCQCVCKNKLFPSQCGANREF
DENTCQCVCKRTCPRNQPLNPGKCACECTESPQKCLLKGKKFHHQTCSCYRRPCTNRQ 
KACEPGFSYSEEVCRCVPSYWKRPQMS (SEQ ID NO: 106) 
tPA (NM_000930)

1 atggccctgt ccactgagca tcctcccgcc acacagaaac ccgcccagcc ggggccaccg 
61 accccacccc ctgcctggaa acttaaggag gccggagctg tggggagctc agagctgaga 
121 tcctacagga gtccagggct ggagagaaaa cctctgcgag gaaagggaag gagcaagccg 
181 tgaatttaag ggacgctgtg aagcaatcat ggatgcaatg aagagagggc tctgctgtgt 
241 gctgctgctg tgtggagcag tcttcgtttc gcccagccag gaaatccatg cccgattcag 
301 aagaggagcc agatcttacc aagtgatctg cagagatgaa aaaacgcaga tgatatacca
361 gcaacatcag tcatggctgc gccctgtgct cagaagcaac cgggtggaat attgctggtg
421 caacagtggc agggcacagt gccactcagt gcctgtcaaa agttgcagcg agccaaggtg 
481 tttcaacggg ggcacctgcc agcaggccct gtacttctca gatttcgtgt gccagtgccc 
541 cgaaggattt gctgggaagt gctgtgaaat agataccagg gccacgtgct acgaggacca 
601 gggcatcagc tacaggggca cgtggagcac agcggagagt ggcgccgagt gcaccaactg
661 gaacagcagc gcgttggccc agaagcccta cagcgggcgg aggccagacg ccatcaggct
721 gggcctgggg aaccacaact actgcagaaa cccagatcga gactcaaagc cctggtgcta 
781 cgtctttaag gcggggaagt acagctcaga gttctgcagc acccctgcct gctctgaggg 
841 aaacagtgac tgctactttg ggaatgggtc agcctaccgt ggcacgcaca gcctcaccga 
901 gtcgggtgcc tcctgcctcc cgtggaattc catgatcctg ataggcaagg tttacacagc
961 acagaacccc agtgcccagg cactgggcct gggcaaacat aattactgcc ggaatcctga 
1021 tggggatgcc aagccctggt gccacgtgct gaagaaccgc aggctgacgt gggagtactg 
1081 tgatgtgccc tcctgctcca cctgcggcct gagacagtac agccagcctc agtttcgcat 
1141 caaaggaggg ctcttcgccg acatcgcctc ccacccctgg caggctgcca tctttgccaa
1201 gcacaggagg tcgcccggag agcggttcct gtgcgggggc atactcatca gctcctgctg
1261 gattctctct gccgcccact gcttccagga gaggtttccg ccccaccacc tgacggtgat
1321 cttgggcaga acataccggg tggtccctgg cgaggaggag cagaaatttg aagtcgaaaa
1381 atacattgtc cataaggaat tcgatgatga cacttacgac aatgacattg cgctgctgca
1441 gctgaaatcg gattcgtccc gctgtgccca ggagagcagc gtggtccgca ctgtgtgcct 
1501 tcccccggcg gacctgcagc tgccggactg gacggagtgt gagctctccg gctacggcaa 
1561 gcatgaggcc ttgtctcctt tctattcgga gcggctgaag gaggctcatg tcagactgta
1621 cccatccagc cgctgcacat cacaacattt acttaacaga acagtcaccg acaacatgct 
1681 gtgtgctgga gacactcgga gcggcgggcc ccaggcaaac ttgcacgacg cctgccaggg 
1741 cgattcggga ggccccctgg tgtgtctgaa cgatggccgc atgactttgg tgggcatcat 
1801 cagctggggc ctgggctgtg gacagaagga tgtcccgggt gtgtacacca aggttaccaa
1861 ctacctagac tggattcgtg acaacatgcg accgtgacca ggaacacccg actcctcaaa
1921 agcaaatgag atcccgcctc ttcttcttca gaagacactg caaaggcgca gtgcttctct
1981 acagacttct ccagacccac cacaccgcag aagcgggacg agaccctaca ggagagggaa
2041 gagtgcattt tcccagatac ttcccatttt ggaagttttc aggacttggt ctgatttcag 
2101 gatactctgt cagatgggaa gacatgaatg cacactagcc tctccaggaa tgcctcctcc 
2161 ctgggcagaa agtggccatg ccaccctgtt ttcagctaaa gcccaacctc ctgacctgtc 
2221 accgtgagca gctttggaaa caggaccaca aaaatgaaag catgtctcaa tagtaaaaga 
2281 taacaagatc tttcaggaaa gacggattgc attagaaata gacagtatat ttatagtcac
2341 aagagcccag cagggcctca aagttggggc aggctggctg gcccgtcatg ttcctcaaaa 
2401 gcacccttga cgtcaagtct ccttcccctt tccccactcc ctggctctca gaaggtattc 
2461 cttttgtgta cagtgtgtaa agtgtaaatc ctttttcttt ataaacttta gagtagcatg 
2521 agagaattgt atcatttgaa caactaggct tcagcatatt tatagcaatc catgttagtt 
2581 tttactttct gttgccacaa ccctgtttta tactgtactt aataaattca gatatatttt 
2641 tcacagtttt tcc (SEQ ID NO: 107)
tPA (NM_000930)
MDAMKRGLCCVLLLCGAVFVSPSQEIHARFRRGARSYQVICRDE 

KTQMIYQQHQSWLRPVLRSNRVEYCWCNSGRAQCHSVPVKSCSEPRCFNGGTCQQALY 
FSDFVCQCPEGFAGKCCEIDTRATCYEDQGISYRGTWSTAESGAECTNWNSSALAQKP 
YSGRRPDAIRLGLGNHNYCRNPDRDSKPWCYVFKAGKYSSEFCSTPACSEGNSDCYFG 
NGSAYRGTHSLTESGASCLPWNSMILIGKVYTAQNPSAQALGLGKHNYCRNPDGDAKP
WCHVLKNRRLTWEYCDVPSCSTCGLRQYSQPQFRIKGGLFADIASHPWQAAIFAKHRR
SPGERFLCGGILISSCWILSAAHCFQERFPPHHLTVILGRTYRVVPGEEEQKFEVEKY
IVHKEFDDDTYDNDIALLQLKSDSSRCAQESSVVRTVCLPPADLQLPDWTECELSGYG
KHEALSPFYSERLKEAHVRLYPSSRCTSQHLLNRTVTDNMLCAGDTRSGGPQANLHDA
CQGDSGGPLVCLNDGRMTLVGIISWGLGCGQKDVPGVYTKVTNYLDWIRDNMRP (SEQ ID NO: 108)
Thrombomodulin (NM_000361)

1 cttgcaatcc aggctttcct tggaagtggc tgtaacatgt atgaaaagaa agaaaggagg 

61 accaagagat gaaagagggc tgcacgcgtg ggggcccgag tggtgggcgg ggacagtcgt 

121 cttgttacag gggtgctggc cttccctggc gcctgcccct gtcggccccg cccgagaacc 

181 tccctgcgcc agggcagggt ttactcatcc cggcgaggtg atcccatgcg cgagggcggg 

241 cgcaagggcg gccagagaac ccagcaatcc gagtatgcgg catcagccct tcccaccagg 

301 cacttccttc cttttcccga acgtccaggg agggagggcc gggcacttat aaactcgagc

361 cctggccgat ccgcatgtca gaggctgcct cgcaggggct gcgcgcacgg caagaagtgt

421 ctgggctggg acggacagga gaggctgtcg ccatcggcgt cctgtgcccc tctgctccgg 

481 cacggccctg tcgcagtgcc cgcgctttcc ccggcgcctg cacgcggcgc gcctgggtaa 

541 catgcttggg gtcctggtcc ttggcgcgct ggccctggcc ggcctggggt tccccgcacc 

601 cgcagagccg cagccgggtg gcagccagtg cgtcgagcac gactgcttcg cgctctaccc

661 gggccccgcg accttcctca atgccagtca gatctgcgac ggactgcggg gccacctaat 

721 gacagtgcgc tcctcggtgg ctgccgatgt catttccttg ctactgaacg gcgacggcgg 

781 cgttggccgc cggcgcctct ggatcggcct gcagctgcca cccggctgcg gcgaccccaa 

841 gcgcctcggg cccctgcgcg gcttccagtg ggttacggga gacaacaaca ccagctatag 

901 caggtgggca cggctcgacc tcaatggggc tcccctctgc ggcccgttgt gcgtcgctgt 

961 ctccgctgct gaggccactg tgcccagcga gccgatctgg gaggagcagc agtgcgaagt 

1021 gaaggccgat ggcttcctct gcgagttcca cttcccagcc acctgcaggc cactggctgt 

1081 ggagcccggc gccgcggctg ccgccgtctc gatcacctac ggcaccccgt tcgcggcccg

1141 cggagcggac ttccaggcgc tgccggtggg cagctccgcc gcggtggctc ccctcggctt 

1201 acagctaatg tgcaccgcgc cgcccggagc ggtccagggg cactgggcca gggaggcgcc

1261 gggcgcttgg gactgcagcg tggagaacgg cggctgcgag cacgcgtgca atgcgatccc
1321 tggggctccc cgctgccagt gcccagccgg cgccgccctg caggcagacg ggcgctcctg
1381 caccgcatcc gcgacgcagt cctgcaacga cctctgcgag cacttctgcg ttcccaaccc
1441 cgaccagccg ggctcctact cgtgcatgtg cgagaccggc taccggctgg cggccgacca
1501 acaccggtgc gaggacgtgg atgactgcat actggagccc agtccgtgtc cgcagcgctg
1561 tgtcaacaca cagggtggct tcgagtgcca ctgctaccct aactacgacc tggtggacgg
1621 cgagtgtgtg gagcccgtgg acccgtgctt cagagccaac tgcgagtacc agtgccagcc 
1681 cctgaaccaa actagctacc tctgcgtctg cgccgagggc ttcgcgccca ttccccacga 
1741 gccgcacagg tgccagatgt tttgcaacca gactgcctgt ccagccgact gcgaccccaa 

1801 cacccaggct agctgtgagt gccctgaagg ctacatcctg gacgacggtt tcatctgcac
1861 ggacatcgac gagtgcgaaa acggcggctt ctgctccggg gtgtgccaca acctccccgg 
1921 taccttcgag tgcatctgcg ggcccgactc ggcccttgcc cgccacattg gcaccgactg 

1981 tgactccggc aaggtggacg gtggcgacag cggctctggc gagcccccgc ccagcccgac
2041 gcccggctcc accttgactc ctccggccgt ggggctcgtg cattcgggct tgctcatagg
2101 catctccatc gcgagcctgt gcctggtggt ggcgcttttg gcgctcctct gccacctgcg 
2161 caagaagcag ggcgccgcca gggccaagat ggagtacaag tgcgcggccc cttccaagga 
2221 ggtagtgctg cagcacgtgc ggaccgagcg gacgccgcag agactctgag cggcctccgt 
2281 ccaggagcct ggctccgtcc aggagctgtg cctcctcacc cccagctttg ctaccaaagc
2341 accttagctg gcattacagc tggagaagac cctccccgca ccccccaagc tgttttcttc 
2401 tattccatgg ctaactggcg agggggtgat tagagggagg agaatgagcc tcggcctctt
2461 ccgtgacgtc actggaccac tgggcaatga tggcaatttt gtaacgaaga cacagactgc 
2521 gatttgtccc aggtcctcac taccgggcgc aggagggtga gcgttattgg tcggcagcct 
2581 tctgggcaga ccttgacctc gtgggctagg gatgactaaa atatttattt tttttaagta 
2641 tttaggtttt tgtttgtttc ctttgttctt acctgtatgt ctccagtatc cactttgcac 
2701 agctctccgg tctctctctc tctacaaact cccacttgtc atgtgacagg taaactatct 
2761 tggtgaattt ttttttccta gccctctcac atttatgaag caagccccac ttattcccca 
2821 ttcttcctag ttttctcctc ccaggaactg ggccaactca cctgagtcac cctacctgtg 
2881 cctgacccta cttcttttgc tcatctagct gtctgctcag acagaacccc tacatgaaac 

2941 agaaacaaaa acactaaaaa taaaaatggc catttgcttt ttcaccagat ttgctaattt
3001 atcctgaaat ttcagattcc cagagcaaaa taattttaaa caaagggttg agatgtaaaa
3061 ggtattaaat tgatgttgct ggactgtcat agaaattaca cccaaagagg tatttatctt
3121 tacttttaaa cagtgagcct gaattttgtt gctgttttga tttgtactga aaaatggtaa 
3181 ttgttgctaa tcttcttatg caatttcctt ttttgttatt attacttatt tttgacagtg 
3241 ttgaaaatgt tcagaaggtt gctctagatt gagagaagag acaaacacct cccaggagac 
3301 agttcaagaa agcttcaaac tgcatgattc atgccaatta gcaattgact gtcactgttc
3361 cttgtcactg gtagaccaaa ataaaaccag ctctactggt cttgtggaat tgggagcttg
3421 ggaatggatc ctggaggatg cccaattagg gcctagcctt aatcaggtcc tcagagaatt
3481 tctaccattt cagagaggcc ttttggaatg tggcccctga acaagaattg gaagctgccc
3541 tgcccatggg agctggttag aaatgcagaa tcctaggctc caccccatcc agttcatgag
3601 aatctatatt taacaagatc tgcagggggt gtgtctgctc agtaatttga ggacaaccat
3661 tccagactgc ttccaatttt ctggaataca tgaaatatag atcagttata agtagcaggc
3721 caagtcaggc ccttattttc aagaaactga ggaattttct ttgtgtagct ttgctctttg
3781 gtagaaaagg ctaggtacac agctctagac actgccacac agggtctgca aggtctttgg
3841 ttcagctaag ctaggaatga aatcctgctt cagtgtatgg aaataaatgt atcatagaaa
3901 tgtaactttt gtaagacaaa ggttttcctc ttctattttg taaactcaaa atatttgtac
3961 atagttattt atttattgga gataatctag aacacaggca aaatccttgc ttatgacatc
4021 acttgtacaa aataaacaaa taacaatgtg (SEQ ID NO: 109)
Thrombomodulin (NM_000361)
MLGVLVLGALALAGLGFPAPAEPQPGGSQCVEHDCFALYPGPAT 

FLNASQICDGLRGHLMTVRSSVAADVISLLLNGDGGVGRRRLWIGLQLPPGCGDPKRL 
GPLRGFQWVTGDNNTSYSRWARLDLNGAPLCGPLCVAVSAAEATVPSEPIWEEQQCEV
KADGFLCEFHFPATCRPLAVEPGAAAAAVSITYGTPFAARGADFQALPVGSSAAVAPL
GLQLMCTAPPGAVQGHWAREAPGAWDCSVENGGCEHACNAIPGAPRCQCPAGAALQAD
GRSCTASATQSCNDLCEHFCVPNPDQPGSYSCMCETGYRLAADQHRCEDVDDCILEPS
PCPQRCVNTQGGFECHCYPNYDLVDGECVEPVDPCFRANCEYQCQPLNQTSYLCVCAE
GFAPIPHEPHRCQMFCNQTACPADCDPNTQASCECPEGYILDDGFICTDIDECENGGF
CSGVCHNLPGTFECICGPDSALARHIGTDCDSGKVDGGDSGSGEPPPSPTPGSTLTPP
AVGLVHSGLLIGISIASLCLVVALLALLCHLRKKQGAARAKMEYKCAAPSKEVVLQHV
RTERTPQRL (SEQ ID NO: 110)
TF (NM_001993)

1 aagactgcga gctccccgca ccccctcgca ctccctctgg ccggcccagg gcgccttcag 

61 cccaacctcc ccagccccac gggcgccacg gaacccgctc gatctcgccg ccaactggta 

121 gacatggaga cccctgcctg gccccgggtc ccgcgccccg agaccgccgt cgctcggacg 

181 ctcctgctcg gctgggtctt cgcccaggtg gccggcgctt caggcactac aaatactgtg 

241 gcagcatata atttaacttg gaaatcaact aatttcaaga caattttgga gtgggaaccc 

301 aaacccgtca atcaagtcta cactgttcaa ataagcacta agtcaggaga ttggaaaagc 

361 aaatgctttt acacaacaga cacagagtgt gacctcaccg acgagattgt gaaggatgtg

421 aagcagacgt acttggcacg ggtcttctcc tacccggcag ggaatgtgga gagcaccggt

481 tctgctgggg agcctctgta tgagaactcc ccagagttca caccttacct ggagacaaac 

541 ctcggacagc caacaattca gagttttgaa caggtgggaa caaaagtgaa tgtgaccgta 

601 gaagatgaac ggactttagt cagaaggaac aacactttcc taagcctccg ggatgttttt 

661 ggcaaggact taatttatac actttattat tggaaatctt caagttcagg aaagaaaaca 

721 gccaaaacaa acactaatga gtttttgatt gatgtggata aaggagaaaa ctactgtttc 

781 agtgttcaag cagtgattcc ctcccgaaca gttaaccgga agagtacaga cagcccggta 

841 gagtgtatgg gccaggagaa aggggaattc agagaaatat tctacatcat tggagctgtg 

901 gtatttgtgg tcatcatcct tgtcatcatc ctggctatat ctctacacaa gtgtagaaag 

961 gcaggagtgg ggcagagctg gaaggagaac tccccactga atgtttcata aaggaagcac 

1021 tgttggagct actgcaaatg ctatattgca ctgtgaccga gaacttttaa gaggatagaa 

1081 tacatggaaa cgcaaatgag tatttcggag catgaagacc ctggagttca aaaaactctt 

1141 gatatgacct gttattacca ttagcattct ggttttgaca tcagcattag tcactttgaa 

1201 atgtaacgaa tggtactaca accaattcca agttttaatt tttaacacca tggcaccttt 

1261 tgcacataac atgctttaga ttatatattc cgcacttaag gattaaccag gtcgtccaag 

1321 caaaaacaaa tgggaaaatg tcttaaaaaa tcctgggtgg acttttgaaa agcttttttt 

1381 tttttttttt tttgagacgg agtcttgctc tgttgcccag gctggagtgc agtagcacga

1441 tctcggctca cttgcaccct ccgtctctcg ggttcaagca attgtctgcc tcagcctccc 

1501 gagtagctgg gattacaggt gcgcactacc acgccaagct aatttttgta ttttttagta 

1561 gagatggggt ttcaccatct tggccaggct ggtcttgaat tcctgacctc agtgatccac 

1621 ccaccttggc ctcccaaaga tgctagtatt atgggcgtga accaccatgc ccagccgaaa 

1681 agcttttgag gggctgactt caatccatgt aggaaagtaa aatggaagga aattgggtgc

1741 atttctagga cttttctaac atatgtctat aatatagtgt ttaggttctt ttttttttca 

1801 ggaatacatt tggaaattca aaacaattgg gcaaactttg tattaatgtg ttaagtgcag 

1861 gagacattgg tattctgggc agcttcctaa tatgctttac aatctgcact ttaactgact 

1921 taagtggcat taaacatttg agagctaact atatttttat aagactacta tacaaactac 

1981 agagtttatg atttaaggta cttaaagctt ctatggttga cattgtatat ataatttttt 

2041 aaaaaggttt ttctatatgg ggattttcta tttatgtagg taatattgtt ctatttgtat 

2101 atattgagat aatttattta atatacttta aataaaggtg actgggaatt gtt (SEQ ID NO: 111)
TF (NM_001993)
METPAWPRVPRPETAVARTLLLGWVFAQVAGASGTTNTVAAYNL 

TWKSTNFKTILEWEPKPVNQVYTVQISTKSGDWKSKCFYTTDTECDLTDEIVKDVKQT 
YLARVFSYPAGNVESTGSAGEPLYENSPEFTPYLETNLGQPTIQSFEQVGTKVNVTVE
DERTLVRRNNTFLSLRDVFGKDLIYTLYYWKSSSSGKKTAKTNTNEFLIDVDKGENYC
FSVQAVIPSRTVNRKSTDSPVECMGQEKGEFREIFYIIGAVVFVVIILVIILAISLHK
CRKAGVGQSWKENSPLNVS (SEQ ID NO: 112)
GPR4 (NM_005282)

1 ctggtgacct tacttatctc tgttgctttc tggggtccta ggaaatgcca gcactcccac 
61 ccacattgcc tgaactttcc aacactccct agctgcgctg tgtcctatct caacacttcc 
121 tcatgtattt cttgtgtctt ctagaacatt cccccgccat tattacttca atatggctac 
181 acatacttcc taattgccct gcaaaccatc tccttctcac cattgcccag cgatgctttc 
241 gtctcctcca taaacactcc cggagaccaa tttttgtgtc acccccatac tccctcgttg 
301 acacactgac tccatacata acctccttga aaaacctctt tattaatctc accatcctcc
361 agacttccct cctgtcataa ttccatccct cctccaactt ttccctctca agctctgccc 
421 ttcccagccc agcccagcct acccaacctc atctcttccc tgtagaccac atcccaccat 
481 gttcccctga gcctccaagg aaggggctca gggggcccca tggcctcccg ctccctgtgg 
541 ccccacagcc cccgtgggcc aggggaagcg ccccagaagc cgaagtgccc accatgggca 
601 accacacgtg ggagggctgc cacgtggact cgcgcgtgga ccacctcttt ccgccatccc 
661 tctacatctt tgtcatcggc gtggggctgc ccaccaactg cctggctctg tgggcggcct 
721 accgccaggt gcaacagcgc aacgagctgg gcgtctacct gatgaacctc agcatcgccg 
781 acctgctgta catctgcacg ctgccgctgt gggtggacta cttcctgcac cacgacaact 
841 ggatccacgg ccccgggtcc tgcaagctct ttgggttcat cttctacacc aatatctaca 
901 tcagcatcgc cttcctgtgc tgcatctcgg tggaccgcta cctggctgtg gcccacccac 
961 tccgcttcgc ccgcctgcgc cgcgtcaaga ccgccgtggc cgtgagctcc gtggtctggg 
1021 ccacggagct gggcgccaac tcggcgcccc tgttccatga cgagctcttc cgagaccgct 
1081 acaaccacac cttctgcttt gagaagttcc ccatggaagg ctgggtggcc tggatgaacc 
1141 tctatcgggt gttcgtgggc ttcctcttcc cgtgggcgct catgctgctg tcgtaccggg
1201 gcatcctgcg ggccgtgcgg ggcagcgtgt ccaccgagcg ccaggagaag gccaagatca
1261 agcggctggc cctcagcctc atcgccatcg tgctggtctg ctttgcgccc tatcacgtgc
1321 tcttgctgtc ccgcagcgcc atctacctgg gccgcccctg ggactgcggc ttcgaggagc
1381 gcgtcttttc tgcataccac agctcactgg ctttcaccag cctcaactgt gtggcggacc
1441 ccatcctcta ctgcctggtc aacgagggcg cccgcagcga tgtggccaag gccctgcaca 
1501 acctgctccg ctttctggcc agcgacaagc cccaggagat ggccaatgcc tcgctcaccc
1561 tggagacccc actcacctcc aagaggaaca gcacagccaa agccatgact ggcagctggg
1621 cggccactcc gccctcccag ggggaccagg tgcagctgaa gatgctgccg ccagcacaat
1681 gaaccccgag tggcacagaa tccccagttt tcccctctca tcccacagtc ccttctctcc 
1741 tggtctggtg tatgcaaatt gtatggaaaa agggctgtgt taatattcat aagaatacaa
1801 gaacttagga agagtgaggt tggtgtgtca ctggtcaacc tttgtgctcc cagatcccat 
1861 cacagtttgg cgattgtgga gggcctcctg aaggaggaga tgagtaaata tatttttttg 
1921 gagacagggt ctcactgtgt tgcccaggct ggagtgcagt agtgcagtcg tggctcactg 
1981 cagcctccac ctcctgggct ctccagcgat cttcccacat cagcctcccg agtagctggg
2041 accacaaatg tgagcccacc catgcctggc taatttttgt actttttgta taaatggagt
2101 ctcactatgt ttccccaggc tgatcttgaa ctcctgggct caagagatcc tcctgccttg 
2161 gcctcccaaa gtgctcagat tagagatgtg agccgccatg tctggccaga taaattaagt
2221 caaacatttg gtttccagaa aataaagaca aatagagaag gttagatttt tttttttcca 
2281 acaagtggat aaaagtctgt gactcggggg aaagtggaag gagaaatgca gccgatatag
2341 agtcattatg tttgcaaagc ccctggtcat acaggccagg gaacataaga ccgcaattct
2401 aagtttctag ataaacagcg atctccaagt caagactgag gatgaagagg gagaatgtca
2461 gaactcaagt gaagggcaat cagggcagac tgcctggagg agtgatgcca gaaggtttgg 
2521 gaagaaggtg tgggacaaga agaaagggta tttattcatt cattcaacag aggtttatgt 
2581 agggcactgt gctgggtggg gctggggaca caacaatgac tgaggcagcc tggccttgcc
2641 ttcacagggc tcaccataca caagtaaata aaaaatatgt aatgtttgga attgct (SEQ ID NO: 113)
GPR4 (NM_005282)
MGNHTWEGCHVDSRVDHLFPPSLYIFVIGVGLPTNCLALWAAYR

QVQQRNELGVYLMNLSIADLLYICTLPLWVDYFLHHDNWIHGPGSCKLFGFIFYTNIY
ISIAFLCCISVDRYLAVAHPLRFARLRRVKTAVAVSSVVWATELGANSAPLFHDELFR
DRYNHTFCFEKFPMEGWVAWMNLYRVFVGFLFPWALMLLSYRGILRAVRGSVSTERQE
KAKIKRLALSLIAIVLVCFAPYHVLLLSRSAIYLGRPWDCGFEERVFSAYHSSLAFTS
LNCVADPILYCLVNEGARSDVAKALHNLLRFLASDKPQEMANASLTLETPLTSKRNST
AKAMTGSWAATPPSQGDQVQLKMLPPAQ (SEQ ID NO: 114)
GPR66 (NM_006056)

1 agcggggggt tcccggccgg acaggcgggg cgtcggggcg cgggctgggg ccgctgtcag 
61 tcagtccact ggctcccgcg ccgcgtctgt gtccgtcgct cggagggtgg aagccggggt 
121 ctcgcgggcc gcgggccgca tgactcctct ctgcctcaat tgctctgtcc tccctggaga 
181 cctgtaccca gggggtgcaa ggaaccccat ggcttgcaat ggcagtgcgg ccagggggca 
241 ctttgaccct gaggacttga acctgactga cgaggcactg agactcaagt acctggggcc 
301 ccagcagaca gagctgttca tgcccatctg tgccacatac ctgctgatct tcgtggtggg
361 cgctgtgggc aatgggctga cctgtctggt catcctgcgc cacaaggcca tgcgcacgcc
421 taccaactac tacctcttca gcctggccgt gtcggacctg ctggtgctgc tggtgggcct
481 gcccctggag ctctatgaga tgtggcacaa ctaccccttc ctgctgggcg ttggtggctg
541 ctatttccgc acgctactgt ttgagatggt ctgcctggcc tcagtgctca acgtcactgc
601 cctgagcgtg gaacgctatg tggccgtggt gcacccactc caggccaggt ccatggtgac
661 gcgggcccat gtgcgccgag tgcttggggc cgtctggggt cttgccatgc tctgctccct 
721 gcccaacacc agcctgcacg gcatccagca gctgcacgtg ccctgccggg gcccagtgcc 
781 agactcagct gtttgcatgc tggtccgccc acgggccctc tacaacatgg tagtgcagac 
841 caccgcgctg ctcttcttct gcctgcccat ggccatcatg agcgtgctct acctgctcat 
901 tgggctgcga ctgcggcggg agaggctgct gctcatgcag gaggccaagg gcaggggctc
961 tgcagcagcc aggtccagat acacctgcag gctccagcag cacgatcggg gccggagaca
1021 agtgaccaag atgctgtttg tcctggtcgt ggtgtttggc atctgctggg ccccgttcca
1081 cgccgaccgc gtcatgtgga gcgtcgtgtc acagtggaca gatggcctgc acctggcctt
1141 ccagcacgtg cacgtcatct ccggcatctt cttctacctg ggctcggcgg ccaaccccgt
1201 gctctatagc ctcatgtcca gccgcttccg agagaccttc caggaggccc tgtgcctcgg
1261 ggcctgctgc catcgcctca gaccccgcca cagctcccac agcctcagca ggatgaccac
1321 aggcagcacc ctgtgtgatg tgggctccct gggcagctgg gtccaccccc tggctgggaa
1381 cgatggccca gaggcgcagc aagagaccga tccatcctga gtggagcctt aaagtggctt
1441 cacctggagg ggccagaggg tcacctggag ctggggagac acatctgcct tcctctgcag 
1501 ggatccttca cgtactgtcc ctagttcagc ctagaaattc tgaccagcac ctcagtttcc 
1561 ctcagaggga aacagcagga ggagggatcc ctgactgctg aggactcaca ctgaccagac 
1621 gccacacctt gtgcttctta tctgtccact gccactcccc cagttcaaat ccttaccctg 
1681 cagaaatatc acagttagct ggggctcagc agtcctccct ctggggactc cctgccacca
1741 ctgccagttt ctgaaacggt cccactgggt cctcactgtc cttcccagtt cctgttcagg 
1801 ttctggcagg ggcccaggga tccaggggac ctggttccaa tctcagccct gctgtcacca
1861 ccttgtcatg caccatcaag catatcagtc tacctttctt tttttctgag acagagtctc 
1921 actctgtcgc ccaggctaga gtgcagtggc gcgattttgg ctcactgcaa cctccgcctc 
1981 cggggttcaa gcgattctcc tgcctcagcc tcccgagttg ctgggactac aggtgagccc
2041 cagcatgccc agctaatttt ttttaatttt tagtagagac ggggtttcac catgttggcc 
2101 aggctggtct caaactcttg acctcaggtg atccgccgac ctcggcctcc caaagtcctc 
2161 ggattacagg catgagccac cacacccggc caatcagtcc acctttctag gccttggttc 
2221 cttgcctgaa aaatgaaaga ggcgctggct ttccacagtg tcatgctttg gcactttagc
2281 tatggttttc tttctgtgtg tgtgtaagcc actgcttata ataaaaccaa caataccctc
2341 agactgaaag ggcggaagtt attatctgca tctttatcaa ccccaagccc cacttcctcc
2401 ctgacctccc catgccctcc ccagcctctc ccagcacaag tggggcaaag ccagcatgca 
2461 agcagacccc accaccacag cccacctccg tcctcacata cgtgcaggct ggctcgggag
2521 tccagtgagc agagcattgg acttggctgg ccagagggtc tctgagggca agagacatgg 
2581 ccaaccaagg gcaaggagtg accctgtgga gggttctgcc gaactcaatg cagtgagaag
2641 agggacaggg acaagtagtc cttgaaactg agccccattc tgaatccctg caggccaagt
2701 cattgctcag ccaggactca gttcatgggg gaaacttgac ctgctgcagt ccctgagtct
2761 tgtcctcctg agaggaagcc ctggcttcca aggctgggag ctggaggatg accttcggtc
2821 ggtctgtctg ggttctccct gcagacagct tcctagctca tgcccatagc tcatgctccc 
2881 tgccgagaaa gtggaggacg tggtacaggg ttgcagatgt ttagttttaa aaattcaatt
2941 ataaaaataa taaatgctca tgatagaaaa tttggaaagt gcaaataagc aaaaatgaaa
3001 acaattttaa aaatgtaaaa cctctcttgc cagggaatgg gggaagggca agtgaggagt
3061 tctttaatgg gtgaagagtt tcagttttgc aaaatgaaaa agttctggag atcagttgtg
3121 caacaatatg aatatacata acaatactga actatacact gaaatggtta agatggtaca 
3181 ttttatgtta tgtgtatttt accacaattt ttataaaaag aggattaaat ctaaaggaaa 
3241 gaaaaaatta aaaccaccca taactttact ctgaagcagt aacagtggca tgtttcctcc 
3301 taaaaaaaaa aaaaaaaaaa gaagaaaaaa aaataaagaa aaaaaaaaaa aaaa (SEQ ID NO: 115)
GPR66 (NM_006056)
MTPLCLNCSVLPGDLYPGGARNPMACNGSAARGHFDPEDLNLTD 

EALRLKYLGPQQTELFMPICATYLLIFVVGAVGNGLTCLVILRHKAMRTPTNYYLFSL
AVSDLLVLLVGLPLELYEMWHNYPFLLGVGGCYFRTLLFEMVCLASVLNVTALSVERY
VAVVHPLQARSMVTRAHVRRVLGAVWGLAMLCSLPNTSLHGIQQLHVPCRGPVPDSAV
CMLVRPRALYNMVVQTTALLFFCLPMAIMSVLYLLIGLRLRRERLLLMQEAKGRGSAA
ARSRYTCRLQQHDRGRRQVTKMLFVLVVVFGICWAPFHADRVMWSVVSQWTDGLHLAF
QHVHVISGIFFYLGSAANPVLYSLMSSRFRETFQEALCLGACCHRLRPRHSSHSLSRM
TTGSTLCDVGSLGSWVHPLAGNDGPEAQQETDPS (SEQ ID NO: 116)
SLC22A2 (NM_003058)

1 ctttgaagtc agctggacca aggaaaggcc ctgccctgaa ggctggtcac ttgcagaggt 

61 aaactcccct ctttgacttc tggccagggt ttgtgctgag ctggctgcag ccgctctcag 

121 cctcgctccg ggcacgtcgg gcagcctcgg gccctcctgc ctgcaggatc atgcccacca 

181 ccgtggacga tgtcctggag catggagggg agtttcactt tttccagaag caaatgtttt 

241 tcctcttggc tctgctctcg gctaccttcg cgcccatcta cgtgggcatc gtcttcctgg 

301 gcttcacccc tgaccaccgc tgccggagcc ccggagtggc cgagctgagt ctgcgctgcg

361 gctggagtcc tgcagaggaa ctgaactaca cggtgccggg cccaggacct gcgggcgaag

421 cctccccaag acagtgtagg cgctacgagg tggactggaa ccagagcacc ttcgactgcg 

481 tggaccccct ggccagcctg gacaccaaca ggagccgcct gccactgggc ccctgccggg 

541 acggctgggt gtacgagacg cctggctcgt ccatcgtcac cgagtttaac ctggtatgtg 

601 ccaactcctg gatgttggac ctattccagt catcagtgaa tgtaggattc tttattggct 

661 ctatgagtat cggctacata gcagacaggt ttggccgtaa gctctgcctc ctaactacag 

721 tcctcataaa tgctgcagct ggagttctca tggccatttc cccaacctat acgtggatgt 

781 taatttttcg cttaatccaa ggactggtca gcaaagcagg ctggttaata ggctacatcc 

841 tgattacaga atttgttggg cggagatatc ggagaacagt ggggattttt taccaagttg 

901 cctatacagt tgggctcctg gtgctagctg gggtggctta cgcacttcct cactggaggt 

961 ggttgcagtt cacagttgct ctgcccaact tcttcttctt gctctattac tggtgcatac 

1021 ctgagtctcc caggtggctg atctcccaga ataagaatgc tgaagccatg agaatcatta 

1081 agcacatcgc aaagaaaaat ggaaaatctc tacccgcctc ccttcagcgc ctgagacttg 

1141 aagaggaaac tggcaagaaa ttgaaccctt catttcttga cttggtcaga actcctcaga 

1201 taaggaaaca tactatgata ttgatgtaca actggttcac gagctctgtg ctctaccagg
1261 gcctcatcat gcacatgggc cttgcaggtg acaatatcta cctggatttc ttctactctg 
1321 ccctggttga attcccagct gccttcatga tcatcctcac catcgaccgc atcggacgcc 

1381 gttacccttg ggctgcatca aatatggttg caggggcagc ctgtctggcc tcagttttta
1441 tacctggtga tctacaatgg ctaaaaatta ttatctcatg cttgggaaga atggggatca 
1501 caatggccta tgagatagtc tgcctggtca atgctgagct gtaccccaca ttcattagga 
1561 atcttggcgt ccacatctgt tcctcaatgt gtgacattgg tggcatcatc acgccattcc 
1621 tggtctaccg gctcactaac atctggcttg agctcccgct gatggttttc ggcgtgcttg 
1681 gcttggttgc tggaggtctg gtgctgttgc ttccagaaac taaagggaaa gctttgcctg 
1741 agaccatcga ggaagccgaa aatatgcaaa gaccaagaaa aaataaagaa aagatgattt 
1801 acctccaagt tcagaaacta gacattccat tgaactaaga agagagaccg ttgctgctgt 
1861 catgacctag ctttgatggc agcaagacca aaagtagaaa tccctgcact catcacaaag 
1921 cccatacaac tcaaccaaac ttacccctga gccctatcaa cctaggtcta cagccagtgg 
1981 agtctattgt acactgtgga aaaataccca tgggaccaga tcctgccaaa ttcttccagc 
2041 tcactttatt ctcagcattc ctaggacatt ggacattggt tttctggagg gttttttttc 
2101 catctttgta tttttttaaa tttgattctt ttctttgcaa tgctatctaa ccagaataca 
2161 taggggaact gtgggctagg caaacaaaat agaaaaaagt gtgaaaaaca gtaaagttgg 
2221 gagaggagca tctattttct taaagaaata aaacacccaa aacaatataa agttgtccag 
2281 aatgtatgtc aagaatttta gataggcctt tcagtaacac aggtgaagaa atttttaaaa
2341 atacattgat tattatctag gttagactta aagtgaatct caaataaaag aatcaggaat 
2401 acaacttaag tgatcatgag gtccttccat atttagattg ggtaagcatg aatgtgtatt 
2461 ttctacaaaa gaccttgaga agagttcaat aaaaaatgtt agcattataa aa (SEQ ID NO: 117)



FIGURE 62A 



SLC22A2 (NM_003058)
MPTTVDDVLEHGGEFHFFQKQMFFLLALLSATFAPIYVGIVFLG 

FTPDHRCRSPGVAELSLRCGWSPAKELNYTVPGPGPAGEASPRQCRRYEVDWNQSTFD 
CVDPLASLDTNRSRLPLGPCRDGWVYETPGSSIVTEFNLVCANSWMLDLFQSSVNVGF
FIGSMSIGYIADRFGRKLCLLTTVLINAAAGVLMAISPTYTWMLIFRLIQGLVSKAGW
LIGYILITEFVGRRYRRTVGIFYQVAYTVGLLVLAGVAYALPHWRWLQFTVALPNFFF
LLYYWCIPESPRWLISQNKNAEAMRIIKHIAKKNGKSLPASLQRLRLEEETGKKLNPS
FLDLVRTPQIRKHTMILMYNWFTSSVLYQGLIMHMGLAGDNIYLDFFYSALVEFPAAF
MIILTIDRIGRRYPWAASNMVAGAACLASVFIPGDLQWLKIIISCLGRMGITMAYEIV
CLVNAELYPTFIRNLGVHICSSMCDIGGIITPFLVYRLTNIWLELPLMVFGVLGLVAG
GLVLLLPETKGKALPETIEEAENMQRPRKNKEKMIYLQVQKLDIPLN (SEQ ID NO: 118)
MLSN1 (NM_002420)

1 gccctggcca aggaggaggc tgaaagagcc tgagctgtgc cctctccatt ccactgctgt 
61 ggcagggtca gaaatcttgg atagagaaaa ccttttgcaa acgggaatgt atctttgtaa 
121 ttcctagcac gaaagactct aacaggtgtt gctgtggcca gttcaccaac cagcatatcc 
181 cccctctgcc aagtgcaaca cccagcaaaa atgaagagga aaacaaacag gtggagactc 
241 agcctgagaa atggtctgtt gccaagcaca cccagagcta cccaacagat tcctatggag 
301 ttcttgaatt ccagggtggc ggatattcca ataaagccat gtatatccgt gtatcctatg
361 acaccaagcc agactcactg ctccatctca tggtgaaaga ttggcagctg gaactcccca
421 agctcttaat atctgtgcat ggaggcctcc agaactttga gatgcagccc aagctgaaac 
481 aagtctttgg gaaaggcctg atcaaggctg ctatgaccac cggggcctgg atcttcaccg 
541 ggggtgtcag cacaggtgtt atcagccacg taggggatgc cttgaaagac cactcctcca 
601 agtccagagg ccgggtttgt gctataggaa ttgctccatg gggcatcgtg gagaataagg 
661 aagacctggt tggaaaggat gtaacaagag tgtaccagac catgtccaac cctctaagta 
721 agctctctgt gctcaacaac tcccacaccc acttcatcct ggctgacaat ggcaccctgg 
781 gcaagtatgg cgccgaggtg aagctgcgaa ggctgctgga aaagcacatc tccctccaga 
841 agatcaacac aagactgggg cagggcgtgc ccctcgtggg tctcgtggtg gaggggggcc 
901 ctaacgtggt gtccatcgtc ttggaatacc tgcaagaaga gcctcccatc cctgtggtga 
961 tttgtgatgg cagcggacgt gcctcggaca tcctgtcctt tgcgcacaag tactgtgaag 
1021 aaggcggaat aataaatgag tccctcaggg agcagcttct agttaccatt cagaaaacat 
1081 ttaattataa taaggcacaa tcacatcagc tgtttgcaat tataatggag tgcatgaaga 
1141 agaaagaact cgtcactgtg ttcagaatgg gttctgaggg ccagcaggac atcgagatgg 
1201 caattttaac tgccctgctg aaaggaacaa acgtatctgc tccagatcag ctgagcttgg 
1261 cactggcttg gaaccgcgtg gacatagcac gaagccagat ctttgtcttt gggccccact 
1321 ggccgcccct gggaagcctg gcacccccga cggacagcaa agccacggag aaggagaaga 
1381 agccacccat ggccaccacc aagggaggaa gaggaaaagg gaaaggcaag aagaaaggga
1441 aagtgaaaga ggaagtggag gaagaaactg acccccggaa gatagagctg ctgaactggg 
1501 tgaatgcttt ggagcaagcg atgctagatg ctttagtctt agatcgtgtc gactttgtga 
1561 agctcctgat tgaaaacgga gtgaacatgc aacactttct gaccattccg aggctggagg 
1621 agctttataa cacaagactg ggtccaccaa acacacttca tctgctggtg agggatgtga 
1681 aaaagagcaa ccttccgcct gattaccaca tcagcctcat agacatcggg ctcgtgctgg 
1741 agtacctcat gggaggagcc taccgctgca actacactcg gaaaaacttt cggacccttt 
1801 acaacaactt gtttggacca aagaggccta aagctcttaa acttctggga atggaagatg
1861 atgagcctcc agctaaaggg aagaaaaaaa aaaaaaagaa aaaggaggaa gagatcgaca
1921 ttgatgtgga cgaccctgcc gtgagtcggt tccagtatcc cttccacgag ctgatggtgt
1981 gggcagtgct gatgaaacgc cagaaaatgg cagtgttcct ctggcagcga ggggaagaga
2041 gcatggccaa ggccctggtg gcctgcaagc tctacaaggc catggcccac gagtcctccg
2101 agagtgatct ggtggatgac atctcccagg acttggataa caattccaaa gacttcggcc 
2161 agcttgcttt ggagttatta gaccagtcct ataagcatga cgagcagatc gctatgaaac 
2221 tcctgaccta cgagctgaaa aactggagca actcgacctg cctcaaactg gccgtggcag 
2281 ccaaacaccg ggacttcatt gctcacacct gcagccagat gctgctgacc gatatgtgga
2341 tgggaagact gcggatgcgg aagaaccccg gcctgaaggt tatcatgggg attcttctac
2401 cccccaccat cttgtttttg gaatttcgca catatgatga tttctcgtat caaacatcca 
2461 aggaaaacga ggatggcaaa gaaaaagaag aggaaaatac ggatgcaaat gcagatgctg 
2521 gctcaagaaa gggggatgag gagaacgagc ataaaaaaca gagaagtatt cccatcggaa 
2581 caaagatctg tgaattctat aacgcgccca ttgtcaagtt ctggttttac acaatatcat 
2641 acttgggcta cctgctgctg tttaactacg tcatcctggt gcggatggat ggctggccgt 
2701 ccctccagga gtggatcgtc atctcctaca tcgtgagcct ggcgttagag aagatacgag 
2761 agatcctcat gtcagaacca ggcaaactca gccagaaaat caaagtttgg cttcaggagt
2821 actggaacat cacagatctc gtggccattt ccacattcat gattggagca attcttcgcc
2881 tacagaacca gccctacatg ggctatggcc gggtgatcta ctgtgtggat atcatcttct
2941 ggtacatccg tgtcctggac atctttggtg tcaacaagta tctggggcca tacgtgatga
3001 tgattggaaa gatgatgatc gacatgctgt actttgtggt catcatgctg gtcgtgctca
3061 tgagtttcgg agtagcccgt caagccattc tgcatccaga ggagaagccc tcttggaaac
3121 tggcccgaaa catcttctac atgccctact ggatgatcta tggagaggtg tttgcagacc 
3181 agatagacct ctacgccatg gaaattaatc ctccttgtgg tgagaaccta tatgatgagg 
3241 agggcaagcg gcttcctccc tgtatccccg gcgcctggct cactccagca ctcatggcgt 
3301 gctatctact ggtcgccaac atcctgctgg tgaacctgct gattgctgtg ttcaacaata 
3361 ccttctttga agtaaaatca atatccaacc aggtgtggaa gttccagcga tatcagctga
3421 ttatgacatt tcatgacagg ccagtcctgc ccccaccgat gatcatttta agccacatct
3481 acatcatcat tatgcgtctc agcggccgct gcaggaaaaa gagagaaggg gaccaagagg
3541 aacgggatcg tggattgaag ctcttcctta gcgacgagga gctaaagagg ctgcatgagt
3601 tcgaggagca gtgcgtgcag gagcacttcc gggagaagga ggatgagcag cagtcgtcca
3661 gcgacgagcg catccgggtc acttctgaaa gagttgaaaa tatgtcaatg aggttggaag
3721 aaatcaatga aagagaaact tttatgaaaa cttccctgca gactgttgac cttcgacttg
3781 ctcagctaga agaattatct aacagaatgg tgaatgctct tgaaaatctt gcgggaatcg
3841 acaggtctga cctgatccag gcacggtccc gggcttcttc tgaatgtgag gcaacgtatc
3901 ttctccggca aagcagcatc aatagcgctg atggctacag cttgtatcga tatcatttta
3961 acggagaaga gttattattt gaggatacat ctctctccac gtcaccaggg acaggagtca
4021 ggaaaaaaac ctgttccttc cgtataaagg aagagaagga cgtgaaaacg cacctagtcc
4081 cagaatgtca gaacagtctt cacctttcac tgggcacaag cacatcagca accccagatg
4141 gcagtcacct tgcagtagat gacttaaaga acgctgaaga gtcaaaatta ggtccagata
4201 ttgggatttc aaaggaagat gatgaaagac agacagactc taaaaaagaa gaaactattt
4261 ccccaagttt aaataaaaca gatgtgatac atggacagga caaatcagat gttcaaaaca
4321 ctcagctaac agtggaaacg acaaatatag aaggcactat ttcctatccc ctggaagaaa
4381 ccaaaattac acgctatttc cccgatgaaa cgatcaatgc ttgtaaaaca atgaagtcca
4441 gaagcttcgt ctattcccgg ggaagaaagc tggtcggtgg ggttaaccag gatgtagagt
4501 acagttcaat cacggaccag caattgacga cggaatggca atgccaagtt caaaagatca
4561 cgcgctctca tagcacagat attccttaca ttgtgtcgga agctgcagtg caagctgagc
4621 ataaagagca gtttgcagat atgcaagatg aacaccatgt cgctgaagca attcctcgaa
4681 tccctcgctt gtccctaacc attactgaca gaaatgggat ggaaaactta ctgtctgtga
4741 agccagatca aactttggga ttcccatctc tcaggtcaaa aagtttacat ggacatccta
4801 ggaatgtgaa atccattcag ggaaagttag acagatctgg acatgccagt agtgtaagca
4861 gcttagtaat tgtgtctgga atgacagcag aagaaaaaaa ggttaagaaa gagaaagctt
4921 ccacagaaac tgaatgctag tctgttttgt ttctttaatt ttttttttta acagtcagaa
4981 ccactaatgg gtgtcatctt ggccatctaa acatcatcaa tttctaaaaa cattttccct
5041 taaaaaattt tggaaattca gacttgattt acaatttaat gcactaaaag tagtattttg
5101 ttagcatatg ttagtaggct tagttttttc agttgcagta gtatcaaatg aaagtgatga
5161 tactgtaacg aagataaatt ggctaatcag tatacaagat tatacaatct ctttattact
5221 gagggccacc aaatagccta ggaagtgccc tcgagcactg aagtcaccat taggtcactt
5281 aagaagtaag caactagctg ggcacagtgg ctcatgcctg taatcctagc actttgggag
5341 gccaaggcag aaagatagct tgagtccagg agtttgagac cagcctgggc aacatagtga
5401 taccccatct cttaaaaaaa aaaaaaaaaa a (SEQ ID NO: 119)
NLSN1 (NM_002420)
MYIRVSYDTKPDSLLHLMVKDWQLELPKLLISVHGGLQNFEMQP
KLKQVFGKGLIKAAMTTGAWIFTGGVSTGVISHVGDALKDHSSKSRGRVCAIGIAPWG
IVENKEDLVGKDVTRVYQTMSNPLSKLSVLNNSHTHFILADNGTLGKYGAEVKLRRLL
EKHISLQKINTRLGQGVPLVGLVVEGGPNVVSIVLEYLQEEPPIPVVICDGSGRASDI
LSFAHKYCEEGGIINESLREQLLVTIQKTFNYNKAQSHQLFAIIMECMKKKELVTVFR
MGSEGQQDIEMAILTALLKGTNVSAPDQLSLALAWNRVDIARSQIFVFGPHWPPLGSL
APPTDSKATEKEKKPPMATTKGGRGKGKGKKKGKVKEEVEEETDPRKIELLNWVNALE
QAMLDALVLDRVDFVKLLIENGVNMQHFLTIPRLEELYNTRLGPPNTLHLLVRDVKKS
NLPPDYHISLIDIGLVLEYLMGGAYRCNYTRKNFRTLYNNLFGPKRPKALKLLGMEDD
EPPAKGKKKKKKKKEEEIDIDVDDPAVSRFQYPFHELMVWAVLMKRQKMAVFLWQRGE
ESMAKALVACKLYKAMAHESSESDLVDDISQDLDNNSKDFGQLALELLDQSYKHDEQI
AMKLLTYELKNWSNSTCLKLAVAAKHRDFIAHTCSQMLLTDMWMGRLRMRKNPGLKVI
MGILLPPTILFLEFRTYDDFSYQTSKENEDGKEKEEENTDANADAGSRKGDEENEHKK
QRSIPIGTKICEFYNAPIVKFWFYTISYLGYLLLFNYVILVRMDGWPSLQEWIVISYI
VSLALEKIREILMSEPGKLSQKIKVWLQEYWNITDLVAISTFMIGAILRLQNQPYMGY
GRVIYCVDIIFWYIRVLDIFGVNKYLGPYVMMIGKMMIDMLYFVVIMLVVLMSFGVAR
QAILHPEEKPSWKLARNIFYMPYWMIYGEVFADQIDLYAMEINPPCGENLYDEEGKRL
PPCIPGAWLTPALMACYLLVANILLVNLLIAVFNNTFFEVKSISNQVWKFQRYQLIMT
FHDRPVLPPPMIILSHIYIIIMRLSGRCRKKREGDQEERDRGLKLFLSDEELKRLHEF
EEQCVQEHFREKEDEQQSSSDERIRVTSERVENMSMRLEEINERETFMKTSLQTVDLR
LAQLEELSNRMVNALENLAGIDRSDLIQARSRASSECEATYLLRQSSINSADGYSLYR
YHFNGEELLFEDTSLSTSPGTGVRKKTCSFRIKEEKDVKTHLVPECQNSLHLSLGTST
SATPDGSHLAVDDLKNAEESKLGPDIGISKEDDERQTDSKKEETISPSLNKTDVIHGQ
DKSDVQNTQLTVETTNIEGTISYPLEETKITRYFPDETINACKTMKSRSFVYSRGRKL
VGGVNQDVEYSSITDQQLTTEWQCQVQKITRSHSTDIPYIVSEAAVQAEHKEQFADMQ
DEHHVAEAIPRIPRLSLTITDRNGMENLLSVKPDQTLGFPSLRSKSLHGHPRNVKSIQ
GKLDRSGHASSVSSLVIVSGMTAEEKKVKKEKASTETEC (SEQ ID NO: 120)
ATN2 (Na/K transport, NM_000702)

1 tctctgtctg ccagggtctc cgactgtccc agacgggctg gtgtgggctt gggatcctcc 
61 tggtgacctc tcccgctaag gtccctcagc cactctgccc caagatgggc cgtggggctg 
121 gccgtgagta ctcacctgcc gccaccacgg cagagaatgg gggcggcaag aagaaacaga 
181 aggagaagga actggatgag ctgaagaagg aggtggcaat ggatgaccac aagctgtcct 
241 tggatgagct gggccgcaaa taccaagtgg acctgtccaa gggcctcacc aaccagcggg 
301 ctcaggacgt tctggctcga gatgggccca acgccctcac accacctccc acaacccctg
361 agtgggtcaa gttctgccgt cagcttttcg gggggttctc catcctgctg tggattgggg 
421 ctatcctctg cttcctggcc tacggcatcc aggctgccat ggaggatgaa ccatccaacg 
481 acaatctata tctgggtgtg gtgctggcag ctgtggtcat tgtcactggc tgcttctcct 
541 actaccagga ggccaagagc tccaagatca tggattcctt caagaacatg gtacctcagc 
601 aagcccttgt gatccgggag ggagagaaga tgcagatcaa cgcagaggaa gtggtggtgg
661 gagacctggt ggaggtgaag ggtggagacc gcgtccctgc tgacctccgg atcatctctt 
721 ctcatggctg taaggtggat aactcatcct taacaggaga gtcggagccc cagacccgct 
781 cccccgagtt cacccatgag aaccccctgg agacccgcaa tatctgtttc ttctccacca 
841 actgtgttga aggcactgcc aggggcattg tgattgccac aggagaccgg acggtgatgg 
901 gccgcatagc tactctcgcc tcaggcctgg aggttgggcg gacacccata gcaatggaga 
961 ttgaacactt catccagctg atcacagggg tcgctgtatt cctgggggtc tccttcttcg 
1021 tgctctccct catcctgggc tacagctggc tggaggcagt catcttcctc atcggcatca 
1081 tagtggccaa cgtgcctgag gggcttctgg ccactgtcac tgtgtgcctg accctgacag 
1141 ccaagcgcat ggcacggaag aactgcctgg tgaagaacct ggaggcggtg gagacgctgg 
1201 gctccacgtc caccatctgc tcggacaaga cgggcaccct cacccagaac cgcatgaccg 
1261 tcgcccacat gtggttcgac aaccaaatcc atgaggctga caccaccgaa gatcagtctg 
1321 gggccacttt tgacaaacga tcccctacgt ggacggccct gtctcgaatt gctggtctct 
1381 gcaaccgcgc cgtcttcaag gcaggacagg agaacatctc cgtgtctaag cgggacacag
1441 ctggtgatgc ctctgagtca gctctgctca agtgcattga gctctcctgt ggctcagtga 
1501 ggaaaatgag agacagaaac cccaaggtgg cagagattcc tttcaactct accaacaagt
1561 accagctgtc tatccacgag cgagaagaca gcccccagag ccacgtgctg gtgatgaagg 
1621 gggccccaga gcgcattctg gaccggtgct ccaccatcct ggtgcagggc aaggagatcc 
1681 cgctcgacaa ggagatgcaa gatgcctttc aaaatgccta catggagctg gggggacttg 
1741 gggagcgtgt gctgggattc tgtcaactga atctgccatc tggaaagttt cctcggggct 
1801 tcaaattcga cacggatgag ctgaactttc ccacggagaa gctttgcttt gtggggctca 
1861 tgtctatgat tgaccctccc cgggctgctg tgccagatgc tgtgggcaag tgccgaagcg 
1921 caggcatcaa ggtgatcatg gtaaccgggg atcaccctat cacagccaag gccattgcca 
1981 aaggcgtggg catcatatca gagggtaacg agactgtgga ggacattgca gcccggctca 
2041 acattcccat gagtcaagtc aaccccagag aagccaaggc atgcgtggtg cacggctctg
2101 acctgaagga catgacatcg gagcagctcg atgagatcct caagaaccac acagagatcg
2161 tctttgctcg aacgtctccc cagcagaagc tcatcattgt ggagggatgt cagaggcagg
2221 gagccattgt ggccgtgacg ggtgacgggg tgaacgactc ccctgcattg aagaaggctg
2281 acattggcat tgccatgggc atctctggct ctgacgtctc taagcaggca gccgacatga
2341 tcctgctgga tgacaacttt gcctccatcg tcacgggggt ggaggagggc cgcctgatct
2401 ttgacaactt gaagaaatcc atcgcctaca ccctgaccag caacatcccc gagatcaccc
2461 ccttcctgct gttcatcatt gccaacatcc ccctacctct gggcactgtg accatccttt
2521 gcattgacct gggcacagat atggtccctg ccatctcctt ggcctatgag gcagctgaga
2581 gtgatatcat gaagcggcag ccacgaaact cccagacgga caagctggtg aatgagaggc
2641 tcatcagcat ggcctacgga cagatcggga tgatccaggc actgggtggc ttcttcacct
2701 actttgtgat cctggcagag aacggtttcc tgccatcacg gctactggga atccgcctcg 
2761 actgggatga ccggaccatg aatgatctgg aggacagcta tggacaggag tggacctatg
2821 agcagcggaa ggtggtggag ttcacgtgcc acacggcatt ctttgccagc atcgtggtgg
2881 tgcagtgggc tgacctcatc atctgcaaga cccgccgcaa ctcagtcttc cagcagggca
2941 tgaagaacaa gatcctgatt tttgggctcc tggaggagac ggcgttggct gcctttctct
3001 cttactgccc aggcatgggt gtagccctcc gcatgtaccc gctcaaagtc acctggtggt
3061 tctgcgcctt cccctacagc ctcctcatct tcatctatga tgaggtccga aagctcatcc
3121 tgcggcggta tcctggtggc tgggtggaga aggagacata ctactgaccc cattggaaga 
3181 agaaccaggc atggaaagat ggggagctct ggaggtgttg tggggatggt gatggagagg 
3241 gatggaaata acgggtggca ttgggtggca acatttgggg agagataatg aggcaactca 
3301 gcaggctaag ttgcggggta tataaattgg ggtgatgacc ccatagacct aactgtgaac
3361 aatcagatta gacactatgt gttagagtcc ccccgaccag atccttttcc atcccactcc
3421 actatgttgt ctattttttc tgaggaatta agggttaccc caccctgccc actcccatcc
3481 cttcaacccc acttcctact gtaatagatc agcatccaaa agcaggaacc catctaaacc
3541 agaaggaagc cctctcagat caccccagcc tcactccatt tcccacttcc acccccgtta
3601 gcttcctgca ggactctatc cctggcttcc ccttcagacc ttgcaatcac aaaaggttct
3661 tctggtgagt gcaagagcct gagactggaa aaggtggact tgtctcccag tcgaggctgg
3721 taagggacct tcagggagag ctgggcagac aggtgggaga tggaggtagg gctggctgga
3781 ggaaggaaac aacaaaggaa gtgaggtagt gccaatgaca ggacatttga catgagtctc
3841 cagatagatg tcgtggactc cagctctacg tcccacattt tagaataccc caccagcaga
3901 acaaactcag atctcatcag ggtagcagca gaggcaggac cagaaggcaa tcaagagctt
3961 ccagaaatgc cacacttgtg tgccacagag ttccccgctg acccttggtt aggggtcctc
4021 ttagtccaca aggtccggat gtcactcatg tacttaataa cacttcacct tctgtaatac 
4081 taagtcctca gagctccatg ctgttctgaa agggatggcc acaagttctt tcccagcctc 
4141 ttccattccc tttcttttca tgcccatccc gatgaacctg catcattccc cgacactgcc
4201 aagccaaccc tggaaaagga gttcgctggc cattggctag aatcagggtg gagaagttcc
4261 ctgaaccttc ctgtctccca gggacatgta tgcttccagg gacaagctta ggtcatgaac 
4321 atggtcagaa cctttggaca agaggaaaaa tactaagaga tttgcttttt ctgggtgcgg
4381 tggctcatgc ctgtaatccc agcactttgg gaggccgagg caggtggatc atgaggtcag
4441 gagttcgagg cgagcctggc caacatggtg aaaccctgtc tctactaaaa gtacaaaaaa
4501 ttagccagtc atggtggcac acgcctgtaa tctcagctac tcaggaggct gaggcaggag
4561 aattgcttga acctgtgagg aagaggttgc agtgagctga gatcgtgcca ttacactcca 
4621 gcctgggcga aagggtgaga ctccatctca aaaaaaaaaa aaatgatttg cttttgacgt 
4681 cttaggtggc agggctgttc cctccaggca aatgcccttc aaaccgacga tcattgtgcc 
4741 cacttaccct gggctggaga gttggtttca ggttcctaca ggagatagct ttctttccct 
4801 tactccctat ctaacacttt tgctctgcag gcagccttgc ccattctcta agcctggctt 
4861 agaaggcact gggaatgtcc tgtagagaga gacctagata ggtcatgcaa gtgagaaaga
4921 catctgagga aaatggaaga cctaaggcag acaggaagga agcacaaaag acaagcattg
4981 ggtcagaccc ataaaccacc tcccaaaggc tgtcatttca ttgcactgga attttgcttt 
5041 atcagaagca aggaagtaag ggagtcattg ccttgggcct gggaatctaa gtgggagaca 
5101 atattaattt ggatccgatt aattggagat tactaactgt ggacaaaagt ttatctttgc 
5161 acaatcaata aaaatggcat ttttttagta aattaagagc ataaacaata ttgctagagg 
5221 tggcatgttt agtctaccaa aaacaatact tttcaggcac tttagaaata tccttttaga 
5281 agcagcgagt gcatgggcta attatcatca atctttatgt atttgttaaa gaaacatcta 
5341 caggatcttt attggtgacc ttttgtaaga cattagtttg aggtactacc tatctacttg 
5401 aaaataataa agtggcattt ctttatgaaa aaaaaagaaa tctcttccat aattcagatt 
5461 tctacacttt atacttgcct ccctcctaaa tcgtgatatt gaaatatggt g (SEQ ID NO: 121)
ATN2 (Na/K transport, NM_000702)
MGRGAGREYSPAATTAENGGGKKKQKEKELDELKKEVAMDDHKL
SLDELGRKYQVDLSKGLTNQRAQDVLARDGPNALTPPPTTPEWVKFCRQLFGGFSILL
WIGAILCFLAYGIQAAMEDEPSNDNLYLGVVLAAVVIVTGCFSYYQEAKSSKIMDSFK
NMVPQQALVIREGEKMQINAEEVVVGDLVEVKGGDRVPADLRIISSHGCKVDNSSLTG
ESEPQTRSPEFTHENPLETRNICFFSTNCVEGTARGIVIATGDRTVMGRIATLASGLE
VGRTPIAMEIEHFIQLITGVAVFLGVSFFVLSLILGYSWLEAVIFLIGIIVANVPEGL
LATVTVCLTLTAKRMARKNCLVKNLEAVETLGSTSTICSDKTGTLTQNRMTVAHMWFD
NQIHEADTTEDQSGATFDKRSPTWTALSRIAGLCNRAVFKAGQENISVSKRDTAGDAS
ESALLKCIELSCGSVRKMRDRNPKVAEIPFNSTNKYQLSIHEREDSPQSHVLVMKGAP
ERILDRCSTILVQGKEIPLDKEMQDAFQNAYMELGGLGERVLGFCQLNLPSGKFPRGF
KFDTDELNFPTEKLCFVGLMSMIDPPRAAVPDAVGKCRSAGIKVIMVTGDHPITAKAI
AKGVGIISEGNETVEDIAARLNIPMSQVNPREAKACVVHGSDLKDMTSEQLDEILKNH
TEIVFARTSPQQKLIIVEGCQRQGAIVAVTGDGVNDSPALKKADIGIAMGISGSDVSK
QAADMILLDDNFASIVTGVEEGRLIFDNLKKSIAYTLTSNIPEITPFLLFIIANIPLP
LGTVTILCIDLGTDMVPAISLAYEAAESDIMKRQPRNSQTDKLVNERLISMAYGQIGM
IQALGGFFTYFVILAENGFLPSRLLGIRLDWDDRTMNDLEDSYGQEWTYEQRKVVEFT
CHTAFFASIVVVQWADLIICKTRRNSVFQQGMKNKILIFGLLEETALAAFLSYCPGMG
VALRMYPLKVTWWFCAFPYSLLIFIYDEVRKLILRRYPGGWVEKETYY (SEQ ID NO: 122)